* [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support
@ 2017-10-05 19:17 ` James Morse
  0 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Hello,

The aim of this series is to enable IESB and add ESB-instructions to let us
kick any pending RAS errors into firmware to be handled by firmware-first.
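
For reference, the ESB instruction here is the v8.2 'Error Synchronization
Barrier'. It lives in the hint space; the one-line barrier.h change in the
diffstat is presumably an esb() helper along these lines (sketch only,
assuming the hint #16 encoding):

	#define esb()		asm volatile("hint #16" : : : "memory")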

Not all systems will have this firmware, so these RAS errors will become
pending SErrors. We should take these as quickly as possible and avoid
panic()ing for errors where we could have continued.

The first part of this series reworks the DAIF masking so that SError is
unmasked unless we are handling a debug exception.

The last part provides the same minimal handling for SErrors that interrupt
KVM. KVM is currently unable to handle SErrors during world-switch: unless
they occur during a magic single-instruction window, it hyp-panics. I suspect
this will be easier to fix once the VHE world-switch is further optimised.

KVM's kvm_inject_vabt() needs updating for v8.2, as we can now specify an ESR
and all-zeros has a RAS meaning.

KVM's existing 'impdef SError to the guest' behaviour probably needs revisiting.
These are errors where we don't know what they mean; they may not be
synchronised by ESB. Today we blame the guest.
My half-baked suggestion would be to make a virtual SError pending, but then
exit to user-space to give Qemu the chance to quit (for virtual machines that
don't generate SError), pend an SError with a new Qemu-specific ESR, or blindly
continue and take KVM's default all-zeros impdef ESR.

Known issues:
 * Synchronous external abort SET severity is not yet considered; all
   synchronous external aborts are still treated as fatal.

 * KVM-Migration: VDISR_EL2 is exposed to userspace as DISR_EL1, but how should
   HCR_EL2.VSE or VSESR_EL2 be migrated when the guest has an SError pending but
   hasn't taken it yet...?

 * No HCR_EL2.{TEA/TERR} setting ... Dongjiu Geng had a patch that was almost
   finished; I haven't seen the new version.

 * KVM unmasks SError and IRQ before calling handle_exit, so we may be
   rescheduled while holding an uncontained ESR... (this is currently an
   improvement on assuming it's an impdef error we can blame on the guest)
    * We need to fix this for APEI's SEI or kernel-first RAS: the guest-exit
      SError handling will need to move to before kvm_arm_vhe_guest_exit().


Changes from v2 ... (where do I start?)
 * All the KVM patches rewritten.
 * VSESR_EL2 setting/save/restore is new, as is saving/restoring VDISR_EL2
   and exposing it to user space as DISR_EL1.

 * The new ARM-ARM (DDI0487B.b) has an SCTLR_EL2.IESB even for !VHE; we turn
   that on.
 * 'survivable' SErrors are now described as 'blocking' because the CPU can't
   make progress; this makes all the commit messages clearer.
 * My IESB!=ESB confusion got fixed, so the crazy eret with SError unmasked
   is gone, never to return.
 * The cost of masking SError on return to user-space has been wrapped up with
   the ret-to-user loop. (This was only visible with microbenchmarks like
   getpid)
 * entry.S changes got cleaner, and the commit messages got better.


This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b serror_rework/v3


Comments and contradictions welcome,

James Morse (18):
  arm64: explicitly mask all exceptions
  arm64: introduce an order for exceptions
  arm64: Move the async/fiq helpers to explicitly set process context
    flags
  arm64: Mask all exceptions during kernel_exit
  arm64: entry.S: Remove disable_dbg
  arm64: entry.S: convert el1_sync
  arm64: entry.S convert el0_sync
  arm64: entry.S: convert elX_irq
  KVM: arm/arm64: mask/unmask daif around VHE guests
  arm64: kernel: Survive corrected RAS errors notified by SError
  arm64: cpufeature: Enable IESB on exception entry/return for
    firmware-first
  arm64: kernel: Prepare for a DISR user
  KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  KVM: arm64: Save/Restore guest DISR_EL1
  KVM: arm64: Save ESR_EL2 on guest SError
  KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  KVM: arm64: Take any host SError before entering the guest

Xie XiuQi (2):
  arm64: entry.S: move SError handling into a C function for future
    expansion
  arm64: cpufeature: Detect CPU RAS Extentions

 arch/arm64/Kconfig                   | 33 +++++++++++++-
 arch/arm64/include/asm/assembler.h   | 50 ++++++++++++++-------
 arch/arm64/include/asm/barrier.h     |  1 +
 arch/arm64/include/asm/cpucaps.h     |  4 +-
 arch/arm64/include/asm/daifflags.h   | 61 +++++++++++++++++++++++++
 arch/arm64/include/asm/esr.h         | 17 +++++++
 arch/arm64/include/asm/exception.h   | 14 ++++++
 arch/arm64/include/asm/irqflags.h    | 40 ++++++-----------
 arch/arm64/include/asm/kvm_emulate.h | 10 +++++
 arch/arm64/include/asm/kvm_host.h    | 16 +++++++
 arch/arm64/include/asm/processor.h   |  2 +
 arch/arm64/include/asm/sysreg.h      |  6 +++
 arch/arm64/include/asm/traps.h       | 36 +++++++++++++++
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kernel/cpufeature.c       | 43 ++++++++++++++++++
 arch/arm64/kernel/debug-monitors.c   |  5 ++-
 arch/arm64/kernel/entry.S            | 86 +++++++++++++++++++++---------------
 arch/arm64/kernel/hibernate.c        |  5 ++-
 arch/arm64/kernel/machine_kexec.c    |  4 +-
 arch/arm64/kernel/process.c          |  3 ++
 arch/arm64/kernel/setup.c            |  8 ++--
 arch/arm64/kernel/signal.c           |  8 +++-
 arch/arm64/kernel/smp.c              | 12 ++---
 arch/arm64/kernel/suspend.c          |  7 +--
 arch/arm64/kernel/traps.c            | 64 ++++++++++++++++++++++++++-
 arch/arm64/kvm/handle_exit.c         | 19 +++++++-
 arch/arm64/kvm/hyp-init.S            |  3 ++
 arch/arm64/kvm/hyp/entry.S           | 13 ++++++
 arch/arm64/kvm/hyp/switch.c          | 19 ++++++--
 arch/arm64/kvm/hyp/sysreg-sr.c       |  6 +++
 arch/arm64/kvm/inject_fault.c        | 13 +++++-
 arch/arm64/kvm/sys_regs.c            |  1 +
 arch/arm64/mm/proc.S                 | 14 +++---
 virt/kvm/arm/arm.c                   |  4 ++
 34 files changed, 513 insertions(+), 115 deletions(-)
 create mode 100644 arch/arm64/include/asm/daifflags.h

-- 
2.13.3

* [PATCH v3 01/20] arm64: explicitly mask all exceptions
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

There are a few places where we want to mask all exceptions. Today we
do this in a piecemeal fashion: typically we expect the caller to have
masked IRQs, the arch code masks debug exceptions, and SError is ignored
because it is probably masked anyway.

Make it clear that 'mask all exceptions' is the intention by adding
helpers to do exactly that.

This will let us unmask SError without having to add 'oh and SError'
to these paths.

Signed-off-by: James Morse <james.morse@arm.com>

---
Remove the disable IRQs comment above cpu_die(): we return from idle via
cpu_resume. This comment is confusing once the local_irq_disable() call
is changed.
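
As a rough usage sketch (not taken from this patch, but using the helpers
the new daifflags.h below provides), a caller that must not take any
exception does:

	unsigned long flags;

	flags = local_mask_daif();	/* D, A, I and F are now masked */
	/* ... work that must not be interrupted by any exception ... */
	local_restore_daif(flags);	/* back to whatever the caller had */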

 arch/arm64/include/asm/assembler.h | 17 +++++++++++
 arch/arm64/include/asm/daifflags.h | 59 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hibernate.c      |  5 ++--
 arch/arm64/kernel/machine_kexec.c  |  4 +--
 arch/arm64/kernel/smp.c            |  9 ++----
 arch/arm64/kernel/suspend.c        |  7 +++--
 arch/arm64/kernel/traps.c          |  3 +-
 arch/arm64/mm/proc.S               |  9 +++---
 8 files changed, 94 insertions(+), 19 deletions(-)
 create mode 100644 arch/arm64/include/asm/daifflags.h

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d58a6253c6ab..114e741ca873 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -31,6 +31,23 @@
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
+	.macro save_and_disable_daif, flags
+	mrs	\flags, daif
+	msr	daifset, #0xf
+	.endm
+
+	.macro disable_daif
+	msr	daifset, #0xf
+	.endm
+
+	.macro enable_daif
+	msr	daifclr, #0xf
+	.endm
+
+	.macro	restore_daif, flags:req
+	msr	daif, \flags
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
new file mode 100644
index 000000000000..fb40da8e1457
--- /dev/null
+++ b/arch/arm64/include/asm/daifflags.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (C) 2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_DAIFFLAGS_H
+#define __ASM_DAIFFLAGS_H
+
+#include <asm/irqflags.h>
+#include <linux/irqflags.h>
+
+/* Mask/unmask/restore all exceptions, including interrupts. */
+static inline unsigned long local_mask_daif(void)
+{
+	unsigned long flags;
+	asm volatile(
+		"mrs	%0, daif		// local_mask_daif\n"
+		"msr	daifset, #0xf"
+		: "=r" (flags)
+		:
+		: "memory");
+	trace_hardirqs_off();
+	return flags;
+}
+
+static inline void local_unmask_daif(void)
+{
+	trace_hardirqs_on();
+	asm volatile(
+		"msr	daifclr, #0xf		// local_unmask_daif"
+		:
+		:
+		: "memory");
+}
+
+static inline void local_restore_daif(unsigned long flags)
+{
+	if (!arch_irqs_disabled_flags(flags))
+		trace_hardirqs_on();
+	asm volatile(
+		"msr	daif, %0		// local_restore_daif"
+	:
+	: "r" (flags)
+	: "memory");
+	if (arch_irqs_disabled_flags(flags))
+		trace_hardirqs_off();
+}
+
+#endif
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 095d3c170f5d..0b95aab3b080 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -27,6 +27,7 @@
 #include <asm/barrier.h>
 #include <asm/cacheflush.h>
 #include <asm/cputype.h>
+#include <asm/daifflags.h>
 #include <asm/irqflags.h>
 #include <asm/kexec.h>
 #include <asm/memory.h>
@@ -285,7 +286,7 @@ int swsusp_arch_suspend(void)
 		return -EBUSY;
 	}
 
-	local_dbg_save(flags);
+	flags = local_mask_daif();
 
 	if (__cpu_suspend_enter(&state)) {
 		/* make the crash dump kernel image visible/saveable */
@@ -315,7 +316,7 @@ int swsusp_arch_suspend(void)
 		__cpu_suspend_exit();
 	}
 
-	local_dbg_restore(flags);
+	local_restore_daif(flags);
 
 	return ret;
 }
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 11121f608eb5..7efcb0dcc61b 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -18,6 +18,7 @@
 
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
+#include <asm/daifflags.h>
 #include <asm/memory.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
@@ -195,8 +196,7 @@ void machine_kexec(struct kimage *kimage)
 
 	pr_info("Bye!\n");
 
-	/* Disable all DAIF exceptions. */
-	asm volatile ("msr daifset, #0xf" : : : "memory");
+	local_mask_daif();
 
 	/*
 	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 9f7195a5773e..c4116f2af43f 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -47,6 +47,7 @@
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/cpu_ops.h>
+#include <asm/daifflags.h>
 #include <asm/mmu_context.h>
 #include <asm/numa.h>
 #include <asm/pgtable.h>
@@ -368,10 +369,6 @@ void __cpu_die(unsigned int cpu)
 /*
  * Called from the idle thread for the CPU which has been shutdown.
  *
- * Note that we disable IRQs here, but do not re-enable them
- * before returning to the caller. This is also the behaviour
- * of the other hotplug-cpu capable cores, so presumably coming
- * out of idle fixes this.
  */
 void cpu_die(void)
 {
@@ -379,7 +376,7 @@ void cpu_die(void)
 
 	idle_task_exit();
 
-	local_irq_disable();
+	local_mask_daif();
 
 	/* Tell __cpu_die() that this CPU is now safe to dispose of */
 	(void)cpu_report_death();
@@ -837,7 +834,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 {
 	set_cpu_online(cpu, false);
 
-	local_irq_disable();
+	local_mask_daif();
 
 	while (1)
 		cpu_relax();
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1e3be9064cfa..f90ee16b0160 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -4,6 +4,7 @@
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
+#include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
 #include <asm/exec.h>
 #include <asm/pgtable.h>
@@ -57,7 +58,7 @@ void notrace __cpu_suspend_exit(void)
 	/*
 	 * Restore HW breakpoint registers to sane values
 	 * before debug exceptions are possibly reenabled
-	 * through local_dbg_restore.
+	 * by cpu_suspend()s local_daif_restore() call.
 	 */
 	if (hw_breakpoint_restore)
 		hw_breakpoint_restore(cpu);
@@ -81,7 +82,7 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * updates to mdscr register (saved and restored along with
 	 * general purpose registers) from kernel debuggers.
 	 */
-	local_dbg_save(flags);
+	flags = local_mask_daif();
 
 	/*
 	 * Function graph tracer state gets incosistent when the kernel
@@ -114,7 +115,7 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * restored, so from this point onwards, debugging is fully
 	 * renabled if it was enabled when core started shutdown.
 	 */
-	local_dbg_restore(flags);
+	local_restore_daif(flags);
 
 	return ret;
 }
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 5ea4b85aee0e..d046495f67eb 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -38,6 +38,7 @@
 
 #include <asm/atomic.h>
 #include <asm/bug.h>
+#include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
 #include <asm/esr.h>
 #include <asm/insn.h>
@@ -642,7 +643,7 @@ asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr)
 		esr_get_class_string(esr));
 
 	die("Oops - bad mode", regs, 0);
-	local_irq_disable();
+	local_mask_daif();
 	panic("bad mode");
 }
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 877d42fb0df6..95233dfc4c39 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -109,10 +109,10 @@ ENTRY(cpu_do_resume)
 	/*
 	 * __cpu_setup() cleared MDSCR_EL1.MDE and friends, before unmasking
 	 * debug exceptions. By restoring MDSCR_EL1 here, we may take a debug
-	 * exception. Mask them until local_dbg_restore() in cpu_suspend()
+	 * exception. Mask them until local_daif_restore() in cpu_suspend()
 	 * resets them.
 	 */
-	disable_dbg
+	disable_daif
 	msr	mdscr_el1, x10
 
 	msr	sctlr_el1, x12
@@ -155,8 +155,7 @@ ENDPROC(cpu_do_switch_mm)
  * called by anything else. It can only be executed from a TTBR0 mapping.
  */
 ENTRY(idmap_cpu_replace_ttbr1)
-	mrs	x2, daif
-	msr	daifset, #0xf
+	save_and_disable_daif flags=x2
 
 	adrp	x1, empty_zero_page
 	msr	ttbr1_el1, x1
@@ -169,7 +168,7 @@ ENTRY(idmap_cpu_replace_ttbr1)
 	msr	ttbr1_el1, x0
 	isb
 
-	msr	daif, x2
+	restore_daif x2
 
 	ret
 ENDPROC(idmap_cpu_replace_ttbr1)
-- 
2.13.3

* [PATCH v3 02/20] arm64: introduce an order for exceptions
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Currently SError is always masked in the kernel. To support RAS exceptions
using SError on hardware with the v8.2 RAS Extensions we need to unmask
SError as much as possible.

Let's define an order for masking and unmasking exceptions. 'dai' is
memorable and effectively what we have today.

Disabling debug exceptions should cause all other exceptions to be masked.
Masking SError should mask irq, but not disable debug exceptions.
Masking irqs has no side effects for other flags. Keeping to this order
makes it easier for entry.S to know which exceptions should be unmasked.

FIQ is never expected, but we mask it when we mask debug exceptions, and
unmask it at all other times.

Given that masking debug exceptions masks everything, we don't need macros
to save/restore that bit independently. Remove them and switch the last
caller over to local_mask_daif()/local_restore_daif().

Signed-off-by: James Morse <james.morse@arm.com>
---
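
Illustration only (not part of the patch): the daifset/daifclr immediate
uses bit 3 for Debug, bit 2 for SError (A), bit 1 for IRQ and bit 0 for
FIQ, so the order described above works out as:

	msr	daifset, #0xf		// mask debug: D, A, I and F all masked
	msr	daifset, #0x6		// mask SError: A and I masked, debug untouched
	msr	daifset, #0x2		// mask IRQ: only I masked
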
 arch/arm64/include/asm/irqflags.h  | 34 +++++++++++++---------------------
 arch/arm64/kernel/debug-monitors.c |  5 +++--
 2 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 8c581281fa12..9ecdca7011f0 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -21,6 +21,19 @@
 #include <asm/ptrace.h>
 
 /*
+ * Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
+ * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'dai'
+ * order:
+ * Masking debug exceptions causes all other exceptions to be masked too/
+ * Masking SError masks irq, but not debug exceptions. Masking irqs has no
+ * side effects for other flags. Keeping to this order makes it easier for
+ * entry.S to know which exceptions should be unmasked.
+ *
+ * FIQ is never expected, but we mask it when we disable debug exceptions, and
+ * unmask it at all other times.
+ */
+
+/*
  * CPU interrupt mask handling.
  */
 static inline unsigned long arch_local_irq_save(void)
@@ -89,26 +102,5 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 {
 	return flags & PSR_I_BIT;
 }
-
-/*
- * save and restore debug state
- */
-#define local_dbg_save(flags)						\
-	do {								\
-		typecheck(unsigned long, flags);			\
-		asm volatile(						\
-		"mrs    %0, daif		// local_dbg_save\n"	\
-		"msr    daifset, #8"					\
-		: "=r" (flags) : : "memory");				\
-	} while (0)
-
-#define local_dbg_restore(flags)					\
-	do {								\
-		typecheck(unsigned long, flags);			\
-		asm volatile(						\
-		"msr    daif, %0		// local_dbg_restore\n"	\
-		: : "r" (flags) : "memory");				\
-	} while (0)
-
 #endif
 #endif
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index c7ef99904934..bbfa4978b657 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -30,6 +30,7 @@
 
 #include <asm/cpufeature.h>
 #include <asm/cputype.h>
+#include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
 #include <asm/system_misc.h>
 
@@ -46,9 +47,9 @@ u8 debug_monitors_arch(void)
 static void mdscr_write(u32 mdscr)
 {
 	unsigned long flags;
-	local_dbg_save(flags);
+	flags = local_mask_daif();
 	write_sysreg(mdscr, mdscr_el1);
-	local_dbg_restore(flags);
+	local_restore_daif(flags);
 }
 NOKPROBE_SYMBOL(mdscr_write);
 
-- 
2.13.3

* [PATCH v3 03/20] arm64: Move the async/fiq helpers to explicitly set process context flags
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Remove the local_{async,fiq}_{en,dis}able macros as they don't respect
our newly defined order and are only used to set the flags for process
context when we bring CPUs online.

Add a helper to do this. The IRQ flag varies as we want it masked on
the boot CPU until we are ready to handle interrupts.
The boot CPU unmasks SError during early boot once it can print an error
message. If we can print an error message about SError, we can do the
same for FIQ. Debug exceptions are already enabled by __cpu_setup(),
which has also configured MDSCR_EL1 to disable MDE and KDE.

Signed-off-by: James Morse <james.morse@arm.com>
---
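
A rough sketch (matching the hunks below) of what the two new values mean
when handed to local_restore_daif():

	local_restore_daif(DAIF_PROCCTX);	/* debug, SError, IRQ and FIQ unmasked */
	local_restore_daif(DAIF_PROCCTX_NOIRQ);	/* as above, but IRQs stay masked */
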
 arch/arm64/include/asm/daifflags.h | 4 +++-
 arch/arm64/include/asm/irqflags.h  | 6 ------
 arch/arm64/kernel/setup.c          | 8 +++++---
 arch/arm64/kernel/smp.c            | 3 +--
 4 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index fb40da8e1457..72389644c811 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -19,6 +19,9 @@
 #include <asm/irqflags.h>
 #include <linux/irqflags.h>
 
+#define DAIF_PROCCTX		0
+#define DAIF_PROCCTX_NOIRQ	PSR_I_BIT
+
 /* Mask/unmask/restore all exceptions, including interrupts. */
 static inline unsigned long local_mask_daif(void)
 {
@@ -55,5 +58,4 @@ static inline void local_restore_daif(unsigned long flags)
 	if (arch_irqs_disabled_flags(flags))
 		trace_hardirqs_off();
 }
-
 #endif
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 9ecdca7011f0..24692edf1a69 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -66,12 +66,6 @@ static inline void arch_local_irq_disable(void)
 		: "memory");
 }
 
-#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
-#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
-
-#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
-#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
-
 /*
  * Save the current interrupt enable state.
  */
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index d4b740538ad5..3a3f479d8fc8 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -48,6 +48,7 @@
 #include <asm/fixmap.h>
 #include <asm/cpu.h>
 #include <asm/cputype.h>
+#include <asm/daifflags.h>
 #include <asm/elf.h>
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
@@ -262,10 +263,11 @@ void __init setup_arch(char **cmdline_p)
 	parse_early_param();
 
 	/*
-	 *  Unmask asynchronous aborts after bringing up possible earlycon.
-	 * (Report possible System Errors once we can report this occurred)
+	 * Unmask asynchronous aborts and fiq after bringing up possible
+	 * earlycon. (Report possible System Errors once we can report this
+	 * occurred).
 	 */
-	local_async_enable();
+	local_restore_daif(DAIF_PROCCTX_NOIRQ);
 
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index c4116f2af43f..ced874565610 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -272,8 +272,7 @@ asmlinkage void secondary_start_kernel(void)
 	set_cpu_online(cpu, true);
 	complete(&cpu_running);
 
-	local_irq_enable();
-	local_async_enable();
+	local_restore_daif(DAIF_PROCCTX);
 
 	/*
 	 * OK, it's off to the idle thread for us
-- 
2.13.3

* [PATCH v3 04/20] arm64: Mask all exceptions during kernel_exit
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

To take RAS Exceptions as quickly as possible we need to keep SError
unmasked as much as possible. We need to mask it during kernel_exit
as taking an error from this code will overwrite the exception-registers.

Adding a naked 'disable_daif' to kernel_exit causes a performance problem
for micro-benchmarks that do no real work (e.g. calling getpid() in a
loop). This is because the ret_to_user loop has already masked IRQs so
that the TIF_WORK_MASK thread flags can't change underneath it; adding
disable_daif is an additional self-synchronising operation.

In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
flags from an SError, in which case the ret_to_user loop must mask SError
while it examines the flags.

Disable all exceptions for return to EL1. For return to EL0, get the
ret_to_user loop to leave all exceptions masked once it has done its
work; this avoids an extra pstate write.

Signed-off-by: James Morse <james.morse@arm.com>
---
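
Condensed from the signal.c hunk below (illustration only), the
do_notify_resume() work loop now looks roughly like this, leaving
everything masked once it is done so that kernel_exit 0 needs no extra
pstate write:

	do {
		if (thread_flags & _TIF_NEED_RESCHED) {
			/* Unmask Debug and SError for the next task */
			local_restore_daif(DAIF_PROCCTX_NOIRQ);
			schedule();
		} else {
			local_restore_daif(DAIF_PROCCTX);
			/* ... signals, uprobes, fpsimd state ... */
		}

		local_mask_daif();	/* everything masked for the flag re-check */
		thread_flags = READ_ONCE(current_thread_info()->flags);
	} while (thread_flags & _TIF_WORK_MASK);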
 arch/arm64/kernel/entry.S  | 10 +++++-----
 arch/arm64/kernel/signal.c |  8 ++++++--
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e1c59d4008a8..f7d7bf9d76e7 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -221,6 +221,8 @@ alternative_else_nop_endif
 
 	.macro	kernel_exit, el
 	.if	\el != 0
+	disable_daif
+
 	/* Restore the task's original addr_limit. */
 	ldr	x20, [sp, #S_ORIG_ADDR_LIMIT]
 	str	x20, [tsk, #TSK_TI_ADDR_LIMIT]
@@ -517,8 +519,6 @@ el1_da:
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
 
-	// disable interrupts before pulling preserved data off the stack
-	disable_irq
 	kernel_exit 1
 el1_sp_pc:
 	/*
@@ -793,7 +793,7 @@ ENDPROC(el0_irq)
  * and this includes saving x0 back into the kernel stack.
  */
 ret_fast_syscall:
-	disable_irq				// disable interrupts
+	disable_daif
 	str	x0, [sp, #S_X0]			// returned x0
 	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for syscall tracing
 	and	x2, x1, #_TIF_SYSCALL_WORK
@@ -803,7 +803,7 @@ ret_fast_syscall:
 	enable_step_tsk x1, x2
 	kernel_exit 0
 ret_fast_syscall_trace:
-	enable_irq				// enable interrupts
+	enable_daif
 	b	__sys_trace_return_skipped	// we already saved x0
 
 /*
@@ -821,7 +821,7 @@ work_pending:
  * "slow" syscall return path.
  */
 ret_to_user:
-	disable_irq				// disable interrupts
+	disable_daif
 	ldr	x1, [tsk, #TSK_TI_FLAGS]
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 0bdc96c61bc0..b1fd14beb044 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -31,6 +31,7 @@
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
 
+#include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
 #include <asm/elf.h>
 #include <asm/cacheflush.h>
@@ -756,9 +757,12 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 		addr_limit_user_check();
 
 		if (thread_flags & _TIF_NEED_RESCHED) {
+			/* Unmask Debug and SError for the next task */
+			local_restore_daif(DAIF_PROCCTX_NOIRQ);
+
 			schedule();
 		} else {
-			local_irq_enable();
+			local_restore_daif(DAIF_PROCCTX);
 
 			if (thread_flags & _TIF_UPROBE)
 				uprobe_notify_resume(regs);
@@ -775,7 +779,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 				fpsimd_restore_current_state();
 		}
 
-		local_irq_disable();
+		local_mask_daif();
 		thread_flags = READ_ONCE(current_thread_info()->flags);
 	} while (thread_flags & _TIF_WORK_MASK);
 }
-- 
2.13.3

* [PATCH v3 05/20] arm64: entry.S: Remove disable_dbg
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

enable_step_tsk is the only user of disable_dbg, which doesn't respect
our 'dai' order for exception masking. enable_step_tsk may enable
single-step, so it previously needed to mask debug exceptions to prevent
us from single-stepping kernel_exit. enable_step_tsk is called at the end
of the ret_to_user loop, which has already masked all exceptions, so this
is no longer needed.

Remove disable_dbg and add a comment noting that enable_step_tsk's caller
should have masked debug exceptions.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 114e741ca873..1b0972046a72 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -68,13 +68,6 @@
 	msr	daif, \flags
 	.endm
 
-/*
- * Enable and disable debug exceptions.
- */
-	.macro	disable_dbg
-	msr	daifset, #8
-	.endm
-
 	.macro	enable_dbg
 	msr	daifclr, #8
 	.endm
@@ -88,9 +81,9 @@
 9990:
 	.endm
 
+	/* call with daif masked */
 	.macro	enable_step_tsk, flgs, tmp
 	tbz	\flgs, #TIF_SINGLESTEP, 9990f
-	disable_dbg
 	mrs	\tmp, mdscr_el1
 	orr	\tmp, \tmp, #1
 	msr	mdscr_el1, \tmp
-- 
2.13.3

* [PATCH v3 06/20] arm64: entry.S: convert el1_sync
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

el1_sync unmasks exceptions on a case-by-case basis: debug exceptions
are unmasked unless this was a debug exception. IRQs are unmasked
for instruction and data aborts only if the interrupted context had
IRQs unmasked.

Following our 'dai' order, el1_dbg should run with everything masked.
For the other cases we can inherit the masking of whatever context we
interrupted.

Add a macro, inherit_daif, to set daif based on the interrupted pstate.

Signed-off-by: James Morse <james.morse@arm.com>
---
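As an illustration (not part of the patch): inherit_daif just copies the
interrupted context's D, A, I and F bits into the current pstate. A
standalone C sketch of that masking, using the aarch64 PSR bit values
(D=0x200, A=0x100, I=0x80, F=0x40):

/* Illustrative: compute the daif value inherit_daif would write, given
 * the interrupted context's spsr (x23 in the entry code). */
#include <stdint.h>
#include <stdio.h>

#define PSR_F_BIT	0x00000040
#define PSR_I_BIT	0x00000080
#define PSR_A_BIT	0x00000100
#define PSR_D_BIT	0x00000200

static uint64_t inherit_daif(uint64_t interrupted_pstate)
{
	return interrupted_pstate &
	       (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
}

int main(void)
{
	/* made-up example: interrupted context had only IRQs masked */
	uint64_t spsr = 0x80000000 | PSR_I_BIT;

	printf("daif inherited from spsr %#llx: %#llx\n",
	       (unsigned long long)spsr,
	       (unsigned long long)inherit_daif(spsr));
	return 0;
}
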
 arch/arm64/include/asm/assembler.h |  6 ++++++
 arch/arm64/kernel/entry.S          | 12 ++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 1b0972046a72..abb5abd61ddb 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -48,6 +48,12 @@
 	msr	daif, \flags
 	.endm
 
+	/* Only on aarch64 pstate, PSR_D_BIT is different for aarch32 */
+	.macro	inherit_daif, pstate:req, tmp:req
+	and	\tmp, \pstate, #(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+	msr	daif, \tmp
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f7d7bf9d76e7..bd54115972a4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -510,11 +510,7 @@ el1_da:
 	 * Data abort handling
 	 */
 	mrs	x3, far_el1
-	enable_dbg
-	// re-enable interrupts if they were enabled in the aborted context
-	tbnz	x23, #7, 1f			// PSR_I_BIT
-	enable_irq
-1:
+	inherit_daif	pstate=x23, tmp=x2
 	clear_address_tag x0, x3
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
@@ -525,7 +521,7 @@ el1_sp_pc:
 	 * Stack or PC alignment exception handling
 	 */
 	mrs	x0, far_el1
-	enable_dbg
+	inherit_daif	pstate=x23, tmp=x2
 	mov	x2, sp
 	bl	do_sp_pc_abort
 	ASM_BUG()
@@ -533,7 +529,7 @@ el1_undef:
 	/*
 	 * Undefined instruction
 	 */
-	enable_dbg
+	inherit_daif	pstate=x23, tmp=x2
 	mov	x0, sp
 	bl	do_undefinstr
 	ASM_BUG()
@@ -550,7 +546,7 @@ el1_dbg:
 	kernel_exit 1
 el1_inv:
 	// TODO: add support for undefined instructions in kernel mode
-	enable_dbg
+	inherit_daif	pstate=x23, tmp=x2
 	mov	x0, sp
 	mov	x2, x1
 	mov	x1, #BAD_SYNC
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 07/20] arm64: entry.S convert el0_sync
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:17   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

el0_sync also unmasks exceptions on a case-by-case basis: debug exceptions
are enabled unless this was a debug exception. IRQs are unmasked for
some exception types but not for others.

el0_dbg should run with everything masked to prevent us taking a debug
exception from do_debug_exception. For the other cases we can unmask
everything. This changes the behaviour of fpsimd_{acc,exc} and el0_inv,
which previously ran with IRQs masked.

This removes the last user of enable_dbg_and_irq, so remove that macro too.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h |  9 ---------
 arch/arm64/kernel/entry.S          | 24 ++++++++++--------------
 2 files changed, 10 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index abb5abd61ddb..c2a37e2f733c 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -97,15 +97,6 @@
 	.endm
 
 /*
- * Enable both debug exceptions and interrupts. This is likely to be
- * faster than two daifclr operations, since writes to this register
- * are self-synchronising.
- */
-	.macro	enable_dbg_and_irq
-	msr	daifclr, #(8 | 2)
-	.endm
-
-/*
  * SMP data memory barrier
  */
 	.macro	smp_dmb, opt
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index bd54115972a4..f7dfe5d2b1fb 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -670,8 +670,7 @@ el0_da:
 	 * Data abort handling
 	 */
 	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit
 	clear_address_tag x0, x26
 	mov	x1, x25
@@ -683,8 +682,7 @@ el0_ia:
 	 * Instruction abort handling
 	 */
 	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -695,7 +693,7 @@ el0_fpsimd_acc:
 	/*
 	 * Floating Point or Advanced SIMD access
 	 */
-	enable_dbg
+	enable_daif
 	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
@@ -705,7 +703,7 @@ el0_fpsimd_exc:
 	/*
 	 * Floating Point or Advanced SIMD exception
 	 */
-	enable_dbg
+	enable_daif
 	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
@@ -716,8 +714,7 @@ el0_sp_pc:
 	 * Stack or PC alignment exception handling
 	 */
 	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -728,8 +725,7 @@ el0_undef:
 	/*
 	 * Undefined instruction
 	 */
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit
 	mov	x0, sp
 	bl	do_undefinstr
@@ -738,7 +734,7 @@ el0_sys:
 	/*
 	 * System instructions, for trapped cache maintenance instructions
 	 */
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
@@ -753,11 +749,11 @@ el0_dbg:
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
-	enable_dbg
+	enable_daif
 	ct_user_exit
 	b	ret_to_user
 el0_inv:
-	enable_dbg
+	enable_daif
 	ct_user_exit
 	mov	x0, sp
 	mov	x1, #BAD_SYNC
@@ -836,7 +832,7 @@ el0_svc:
 	mov	wsc_nr, #__NR_syscalls
 el0_svc_naked:					// compat entry point
 	stp	x0, xscno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
-	enable_dbg_and_irq
+	enable_daif
 	ct_user_exit 1
 
 	ldr	x16, [tsk, #TSK_TI_FLAGS]	// check for syscall hooks
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 08/20] arm64: entry.S: convert elX_irq
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Following our 'dai' order, IRQs should be processed with debug and
SError exceptions unmasked.

Add a helper to unmask these two (and FIQ, for good measure).

Signed-off-by: James Morse <james.morse@arm.com>
---
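For anyone decoding the immediate: in 'msr daifclr, #imm' bit 3 clears D,
bit 2 clears A, bit 1 clears I and bit 0 clears F, so '#(8 | 4 | 1)'
unmasks Debug, SError and FIQ while leaving IRQ masked. A small C sketch
of the effect (the DAIF_* names below are mine, for illustration only):

/* Illustrative: effect of 'msr daifclr, #(8 | 4 | 1)' on PSTATE.DAIF. */
#include <stdio.h>

#define DAIF_DBG	(1u << 3)	/* 8: Debug */
#define DAIF_ABT	(1u << 2)	/* 4: SError/abort */
#define DAIF_IRQ	(1u << 1)	/* 2: IRQ */
#define DAIF_FIQ	(1u << 0)	/* 1: FIQ */

static unsigned int daifclr(unsigned int daif, unsigned int imm)
{
	return daif & ~imm;
}

int main(void)
{
	unsigned int all_masked = DAIF_DBG | DAIF_ABT | DAIF_IRQ | DAIF_FIQ;

	/* enable_da_f: everything unmasked except IRQ (result 0x2) */
	printf("daif after enable_da_f: %#x\n",
	       daifclr(all_masked, DAIF_DBG | DAIF_ABT | DAIF_FIQ));
	return 0;
}
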
 arch/arm64/include/asm/assembler.h | 4 ++++
 arch/arm64/kernel/entry.S          | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index c2a37e2f733c..7ffb2a629dc9 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -54,6 +54,10 @@
 	msr	daif, \tmp
 	.endm
 
+	.macro enable_da_f
+	msr	daifclr, #(8 | 4 | 1)
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f7dfe5d2b1fb..df085ec003b0 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -557,7 +557,7 @@ ENDPROC(el1_sync)
 	.align	6
 el1_irq:
 	kernel_entry 1
-	enable_dbg
+	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
@@ -766,7 +766,7 @@ ENDPROC(el0_sync)
 el0_irq:
 	kernel_entry 0
 el0_irq_naked:
-	enable_dbg
+	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 09/20] KVM: arm/arm64: mask/unmask daif around VHE guests
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Non-VHE systems take an exception to EL2 in order to world-switch into the
guest. When returning from the guest, KVM implicitly restores the DAIF
flags as part of the exception return to the kernel at EL1.

With VHE none of this exception-level jumping happens, so KVM's
world-switch code is exposed to the host kernel's DAIF values, and KVM
spills the guest-exit DAIF values back into the host kernel.
On entry to a guest we have Debug and SError exceptions unmasked; KVM
has switched VBAR but isn't prepared to handle them. On guest exit
Debug exceptions are left disabled once we return to the host and will
stay this way until we enter user space.

Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
happen after the host's VBAR value has been synchronised by the isb in
__vhe_hyp_call (via kvm_call_hyp()). Masking could be deferred until as
late as setting KVM's VBAR value, but is kept here for symmetry.

Signed-off-by: James Morse <james.morse@arm.com>

---
Give me a kick if you want this reworked as a fix (which will then
conflict with this series), or a backportable version.

 arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
 virt/kvm/arm/arm.c                |  4 ++++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e923b58606e2..d3eb79a9ed6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cpufeature.h>
+#include <asm/daifflags.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmio.h>
@@ -384,4 +385,13 @@ static inline void __cpu_init_stage2(void)
 		  "PARange is %d bits, unsupported configuration!", parange);
 }
 
+static inline void kvm_arm_vhe_guest_enter(void)
+{
+	local_mask_daif();
+}
+
+static inline void kvm_arm_vhe_guest_exit(void)
+{
+	local_restore_daif(DAIF_PROCCTX_NOIRQ);
+}
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index b9f68e4add71..665529924b34 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -698,9 +698,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		trace_kvm_entry(*vcpu_pc(vcpu));
 		guest_enter_irqoff();
+		if (has_vhe())
+			kvm_arm_vhe_guest_enter();
 
 		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
 
+		if (has_vhe())
+			kvm_arm_vhe_guest_exit();
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		vcpu->stat.exits++;
 		/*
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 10/20] arm64: entry.S: move SError handling into a C function for future expansion
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, Wang Xiongfeng, kvmarm

From: Xie XiuQi <xiexiuqi@huawei.com>

Today SError is taken using the inv_entry macro that ends up in
bad_mode.

SError can be used by the RAS Extensions to notify either the OS or
firmware of CPU problems, some of which may have been corrected.

To allow this handling to be added, add a do_serror() C function
that just panic()s. Add the entry.S boilerplate to save/restore the
CPU registers and unmask debug exceptions. Future patches may change
do_serror() to return if the SError Interrupt was a notification of a
corrected error.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Wang Xiongfeng <wangxiongfengi2@huawei.com>
[Split out of a bigger patch, added compat path, renamed, enabled debug
 exceptions]
Signed-off-by: James Morse <james.morse@arm.com>
---
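As background (illustration only, not part of the patch): the esr handed
to do_serror() carries the exception class in bits [31:26], and an SError
interrupt has class 0x2f, which is what esr_get_class_string() turns into
the printed name. A standalone C sketch of that decode, on a made-up
example ESR value:

/* Illustrative: extract the exception class from an ESR value. */
#include <stdint.h>
#include <stdio.h>

#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		(0x3fUL << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC_SERROR	0x2f

int main(void)
{
	uint32_t esr = 0xbe000000;	/* made-up example: EC=0x2f */
	uint32_t ec = (esr & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT;

	printf("esr %#x: EC=%#x %s\n", esr, ec,
	       ec == ESR_ELx_EC_SERROR ? "(SError interrupt)" : "");
	return 0;
}
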
 arch/arm64/Kconfig        |  2 +-
 arch/arm64/kernel/entry.S | 36 +++++++++++++++++++++++++++++-------
 arch/arm64/kernel/traps.c | 13 +++++++++++++
 3 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6a56d4..70dfe4e9ccc5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -98,7 +98,7 @@ config ARM64
 	select HAVE_IRQ_TIME_ACCOUNTING
 	select HAVE_MEMBLOCK
 	select HAVE_MEMBLOCK_NODE_MAP if NUMA
-	select HAVE_NMI if ACPI_APEI_SEA
+	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index df085ec003b0..e147c1d00b41 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -375,18 +375,18 @@ ENTRY(vectors)
 	kernel_ventry	el1_sync			// Synchronous EL1h
 	kernel_ventry	el1_irq				// IRQ EL1h
 	kernel_ventry	el1_fiq_invalid			// FIQ EL1h
-	kernel_ventry	el1_error_invalid		// Error EL1h
+	kernel_ventry	el1_error			// Error EL1h
 
 	kernel_ventry	el0_sync			// Synchronous 64-bit EL0
 	kernel_ventry	el0_irq				// IRQ 64-bit EL0
 	kernel_ventry	el0_fiq_invalid			// FIQ 64-bit EL0
-	kernel_ventry	el0_error_invalid		// Error 64-bit EL0
+	kernel_ventry	el0_error			// Error 64-bit EL0
 
 #ifdef CONFIG_COMPAT
 	kernel_ventry	el0_sync_compat			// Synchronous 32-bit EL0
 	kernel_ventry	el0_irq_compat			// IRQ 32-bit EL0
 	kernel_ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
-	kernel_ventry	el0_error_invalid_compat	// Error 32-bit EL0
+	kernel_ventry	el0_error_compat		// Error 32-bit EL0
 #else
 	kernel_ventry	el0_sync_invalid		// Synchronous 32-bit EL0
 	kernel_ventry	el0_irq_invalid			// IRQ 32-bit EL0
@@ -455,10 +455,6 @@ ENDPROC(el0_error_invalid)
 el0_fiq_invalid_compat:
 	inv_entry 0, BAD_FIQ, 32
 ENDPROC(el0_fiq_invalid_compat)
-
-el0_error_invalid_compat:
-	inv_entry 0, BAD_ERROR, 32
-ENDPROC(el0_error_invalid_compat)
 #endif
 
 el1_sync_invalid:
@@ -663,6 +659,10 @@ el0_svc_compat:
 el0_irq_compat:
 	kernel_entry 0, 32
 	b	el0_irq_naked
+
+el0_error_compat:
+	kernel_entry 0, 32
+	b	el0_error_naked
 #endif
 
 el0_da:
@@ -780,6 +780,28 @@ el0_irq_naked:
 	b	ret_to_user
 ENDPROC(el0_irq)
 
+el1_error:
+	kernel_entry 1
+	mrs	x1, esr_el1
+	enable_dbg
+	mov	x0, sp
+	bl	do_serror
+	kernel_exit 1
+ENDPROC(el1_error)
+
+el0_error:
+	kernel_entry 0
+el0_error_naked:
+	mrs	x1, esr_el1
+	enable_dbg
+	mov	x0, sp
+	bl	do_serror
+	enable_daif
+	ct_user_exit
+	b	ret_to_user
+ENDPROC(el0_error)
+
+
 /*
  * This is the fast syscall return path.  We do as little as possible here,
  * and this includes saving x0 back into the kernel stack.
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index d046495f67eb..210647d33bea 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -709,6 +709,19 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
 }
 #endif
 
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	console_verbose();
+
+	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
+		smp_processor_id(), esr, esr_get_class_string(esr));
+	__show_regs(regs);
+
+	panic("Asynchronous SError Interrupt");
+}
+
 void __pte_error(const char *file, int line, unsigned long val)
 {
 	pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 11/20] arm64: cpufeature: Detect CPU RAS Extensions
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

From: Xie XiuQi <xiexiuqi@huawei.com>

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
[Rebased, added esb and config option, reworded commit message]
Signed-off-by: James Morse <james.morse@arm.com>
---
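As an illustration of what has_cpuid_feature() ends up testing here: the
RAS field lives in ID_AA64PFR0_EL1 bits [31:28], and any value of at
least ID_AA64PFR0_RAS_V1 (1) means the extensions are present. A
standalone C sketch of that field check on a made-up register value:

/* Illustrative: decode the RAS field of a sampled ID_AA64PFR0_EL1 value. */
#include <stdint.h>
#include <stdio.h>

#define ID_AA64PFR0_RAS_SHIFT	28
#define ID_AA64PFR0_RAS_V1	0x1

static int cpu_has_ras(uint64_t pfr0)
{
	uint64_t ras = (pfr0 >> ID_AA64PFR0_RAS_SHIFT) & 0xf;

	return ras >= ID_AA64PFR0_RAS_V1;
}

int main(void)
{
	uint64_t pfr0 = 0x10011112;	/* made-up example with RAS == 1 */

	printf("RAS extensions %s\n", cpu_has_ras(pfr0) ? "present" : "absent");
	return 0;
}
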
 arch/arm64/Kconfig               | 16 ++++++++++++++++
 arch/arm64/include/asm/barrier.h |  1 +
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +++++++++++++
 arch/arm64/kernel/process.c      |  3 +++
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 70dfe4e9ccc5..b68f5e93baac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -973,6 +973,22 @@ config ARM64_PMEM
 	  operations if DC CVAP is not supported (following the behaviour of
 	  DC CVAP itself if the system does not define a point of persistence).
 
+config ARM64_RAS_EXTN
+	bool "Enable support for RAS CPU Extensions"
+	default y
+	help
+	  CPUs that support the Reliability, Availability and Serviceability
+	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
+	  errors, classify them and report them to software.
+
+	  On CPUs with these extensions system software can use additional
+	  barriers to determine if faults are pending and read the
+	  classification from a new set of registers.
+
+	  Selecting this feature will allow the kernel to use these barriers
+	  and access the new registers if the system supports the extension.
+	  Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 0fe7e43b7fbc..8b0a0eb67625 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -30,6 +30,7 @@
 #define isb()		asm volatile("isb" : : : "memory")
 #define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
 #define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
+#define esb()		asm volatile("hint #16"  : : : "memory")
 
 #define mb()		dsb(sy)
 #define rmb()		dsb(ld)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da621627d7c..4820d441bfb9 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
 #define ARM64_WORKAROUND_858921			19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
+#define ARM64_HAS_RAS_EXTN			22
 
-#define ARM64_NCAPS				22
+#define ARM64_NCAPS				23
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index f707fed5886f..64e2a80fd749 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -332,6 +332,7 @@
 #define ID_AA64ISAR1_DPB_SHIFT		0
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT		28
 #define ID_AA64PFR0_GIC_SHIFT		24
 #define ID_AA64PFR0_ASIMD_SHIFT		20
 #define ID_AA64PFR0_FP_SHIFT		16
@@ -340,6 +341,7 @@
 #define ID_AA64PFR0_EL1_SHIFT		4
 #define ID_AA64PFR0_EL0_SHIFT		0
 
+#define ID_AA64PFR0_RAS_V1		0x1
 #define ID_AA64PFR0_FP_NI		0xf
 #define ID_AA64PFR0_FP_SUPPORTED	0x0
 #define ID_AA64PFR0_ASIMD_NI		0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index cd52d365d1f0..0fc017b55cb1 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -125,6 +125,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -900,6 +901,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = 1,
 	},
 #endif
+#ifdef CONFIG_ARM64_RAS_EXTN
+	{
+		.desc = "RAS Extension Support",
+		.capability = ARM64_HAS_RAS_EXTN,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_RAS_SHIFT,
+		.min_field_value = ID_AA64PFR0_RAS_V1,
+	},
+#endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 2dc0f8482210..5e5d2f0a1d0a 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -365,6 +365,9 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	 */
 	dsb(ish);
 
+	/* Deliver any pending SError from prev */
+	esb();
+
 	/* the actual thread switch */
 	last = cpu_switch_to(prev, next);
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 12/20] arm64: kernel: Survive corrected RAS errors notified by SError
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Prior to v8.2, SError is an uncontainable fatal exception. The v8.2 RAS
extensions use SError to notify software about RAS errors; these can be
contained by the ESB instruction.

An ACPI system with firmware-first may use SError as its 'SEI'
notification. Future patches may add code to 'claim' this SError as a
notification.

Other systems can distinguish these RAS errors from the SError ESR and
use the AET bits and additional data from RAS-Error registers to handle
the error. Future patches may add this kernel-first handling.

Without support for either of these we will panic(), even if we received
a corrected error. Add code to decode the severity of RAS errors. We can
safely ignore contained errors where the CPU can continue to make
progress. For all other errors we continue to panic().

Signed-off-by: James Morse <james.morse@arm.com>

==
I couldn't come up with a concise way to capture 'can continue to make
progress', so I opted for 'blocking' instead.
---
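To make the severity decode concrete, here is a standalone C sketch
mirroring arm64_ras_serror_get_severity() for a CPU that has the RAS
extensions, run on a made-up ESR value (the constants match the esr.h
additions below):

/* Illustrative: severity decode for a RAS SError ESR. */
#include <stdint.h>
#include <stdio.h>

#define ESR_ELx_IDS		(1UL << 24)	/* impdef SError (the 'ISV' bit) */
#define ESR_ELx_AET		(0x7UL << 10)
#define ESR_ELx_AET_CE		(6UL << 10)
#define ESR_ELx_FSC		0x3f
#define ESR_ELx_FSC_SERROR	0x11

static uint32_t serror_severity(uint32_t esr)
{
	if (esr & ESR_ELx_IDS)
		return 0;			/* impdef: can't interpret */
	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
		return 0;			/* AET is RES0: no severity */
	return esr & ESR_ELx_AET;
}

int main(void)
{
	uint32_t esr = 0xbe001811;	/* made-up: DFSC=0x11, AET=0b110 (CE) */

	printf("aet=%#x (corrected error: %s)\n", serror_severity(esr),
	       serror_severity(esr) == ESR_ELx_AET_CE ? "yes" : "no");
	return 0;
}
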
 arch/arm64/include/asm/esr.h   | 10 ++++++++
 arch/arm64/include/asm/traps.h | 36 ++++++++++++++++++++++++++
 arch/arm64/kernel/traps.c      | 58 ++++++++++++++++++++++++++++++++++++++----
 3 files changed, 99 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 66ed8b6b9976..8ea52f15bf1c 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -85,6 +85,15 @@
 #define ESR_ELx_WNR_SHIFT	(6)
 #define ESR_ELx_WNR		(UL(1) << ESR_ELx_WNR_SHIFT)
 
+/* Asynchronous Error Type */
+#define ESR_ELx_AET		(UL(0x7) << 10)
+
+#define ESR_ELx_AET_UC		(UL(0) << 10)	/* Uncontainable */
+#define ESR_ELx_AET_UEU		(UL(1) << 10)	/* Uncorrected Unrecoverable */
+#define ESR_ELx_AET_UEO		(UL(2) << 10)	/* Uncorrected Restartable */
+#define ESR_ELx_AET_UER		(UL(3) << 10)	/* Uncorrected Recoverable */
+#define ESR_ELx_AET_CE		(UL(6) << 10)	/* Corrected */
+
 /* Shared ISS field definitions for Data/Instruction aborts */
 #define ESR_ELx_SET_SHIFT	(11)
 #define ESR_ELx_SET_MASK	(UL(3) << ESR_ELx_SET_SHIFT)
@@ -99,6 +108,7 @@
 #define ESR_ELx_FSC		(0x3F)
 #define ESR_ELx_FSC_TYPE	(0x3C)
 #define ESR_ELx_FSC_EXTABT	(0x10)
+#define ESR_ELx_FSC_SERROR	(0x11)
 #define ESR_ELx_FSC_ACCESS	(0x08)
 #define ESR_ELx_FSC_FAULT	(0x04)
 #define ESR_ELx_FSC_PERM	(0x0C)
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index d131501c6222..8d2a1fff5c6b 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -19,6 +19,7 @@
 #define __ASM_TRAP_H
 
 #include <linux/list.h>
+#include <asm/esr.h>
 #include <asm/sections.h>
 
 struct pt_regs;
@@ -58,4 +59,39 @@ static inline int in_entry_text(unsigned long ptr)
 	return ptr >= (unsigned long)&__entry_text_start &&
 	       ptr < (unsigned long)&__entry_text_end;
 }
+
+static inline bool arm64_is_ras_serror(u32 esr)
+{
+	bool impdef = esr & ESR_ELx_ISV; /* aka IDS */
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		return !impdef;
+
+	return false;
+}
+
+/* Return the AET bits of an SError ESR, or 0/uncontainable/uncategorized */
+static inline u32 arm64_ras_serror_get_severity(u32 esr)
+{
+	u32 aet = esr & ESR_ELx_AET;
+
+	if (!arm64_is_ras_serror(esr)) {
+		/* Not a RAS error, we can't interpret the ESR */
+		return 0;
+	}
+
+	/*
+	 * AET is RES0 if 'the value returned in the DFSC field is not
+	 * [ESR_ELx_FSC_SERROR]'
+	 */
+	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
+		/* No severity information */
+		return 0;
+	}
+
+	return aet;
+}
+
+bool arm64_blocking_ras_serror(struct pt_regs *regs, unsigned int esr);
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr);
 #endif
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 210647d33bea..ab0f61d98fd6 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -709,17 +709,65 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
 }
 #endif
 
-asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
 {
-	nmi_enter();
-
 	console_verbose();
 
 	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
 		smp_processor_id(), esr, esr_get_class_string(esr));
-	__show_regs(regs);
+	if (regs)
+		__show_regs(regs);
+
+	/* KVM may call this from a preemptible context */
+	preempt_disable();
+
+	/*
+	 * panic() unmasks interrupts, which unmasks SError. Use nmi_panic()
+	 * to avoid re-entering panic.
+	 */
+	nmi_panic(regs, "Asynchronous SError Interrupt");
+
+	cpu_park_loop();
+	unreachable();
+}
+
+bool arm64_blocking_ras_serror(struct pt_regs *regs, unsigned int esr)
+{
+	u32 aet = arm64_ras_serror_get_severity(esr);
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
+		/*
+		 * The CPU can make progress. We may take UEO again as
+		 * a more severe error.
+		 */
+		return false;
+
+	case ESR_ELx_AET_UEU:	/* Uncorrected Unrecoverable */
+	case ESR_ELx_AET_UER:	/* Uncorrected Recoverable */
+		/*
+		 * The CPU can't make progress. The exception may have
+		 * been imprecise.
+		 */
+		return true;
+
+	case ESR_ELx_AET_UC:	/* Uncontainable error */
+	default:
+		/* Error has been silently propagated */
+		arm64_serror_panic(regs, esr);
+	}
+}
+
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	/* non-RAS errors are not containable */
+	if (!arm64_is_ras_serror(esr) || arm64_blocking_ras_serror(regs, esr))
+		arm64_serror_panic(regs, esr);
 
-	panic("Asynchronous SError Interrupt");
+	nmi_exit();
 }
 
 void __pte_error(const char *file, int line, unsigned long val)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 13/20] arm64: cpufeature: Enable IESB on exception entry/return for firmware-first
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

ARM v8.2 has a feature to add implicit error synchronization barriers
whenever the CPU enters or returns from an exception level. Add code to
detect this feature and enable the SCTLR_ELx.IESB bit.

This feature causes RAS errors that are not yet visible to software to
become pending SErrors. We expect to have firmware-first RAS support,
so synchronised RAS errors will be taken immediately to EL3.
Any system without firmware-first handling of errors will take the SError
either immediately after exception return, or when we unmask SError after
entry.S's work.

Platform level RAS support may require additional firmware support.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>

---
Note the sneaky KVM change,
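
As an illustration (not part of the patch): cpu_enable_iesb() amounts to
a read-modify-write of SCTLR_EL1 that sets bit 21. A minimal C model of
the config_sctlr_el1(0, SCTLR_ELx_IESB) call, on a made-up register value:

/* Illustrative: the bit manipulation behind config_sctlr_el1(0, SCTLR_ELx_IESB). */
#include <stdint.h>
#include <stdio.h>

#define SCTLR_ELx_IESB	(1UL << 21)

static uint64_t config_sctlr(uint64_t sctlr, uint64_t clear, uint64_t set)
{
	sctlr &= ~clear;
	sctlr |= set;
	return sctlr;
}

int main(void)
{
	uint64_t sctlr_el1 = 0x34d5d91d;	/* made-up example value */

	printf("sctlr_el1: %#llx -> %#llx\n",
	       (unsigned long long)sctlr_el1,
	       (unsigned long long)config_sctlr(sctlr_el1, 0, SCTLR_ELx_IESB));
	return 0;
}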

 arch/arm64/Kconfig                 | 15 +++++++++++++++
 arch/arm64/include/asm/cpucaps.h   |  3 ++-
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     | 21 +++++++++++++++++++++
 arch/arm64/kvm/hyp-init.S          |  3 +++
 6 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b68f5e93baac..29df2a93688c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -989,6 +989,21 @@ config ARM64_RAS_EXTN
 	  and access the new registers if the system supports the extension.
 	  Platform RAS features may additionally depend on firmware support.
 
+config ARM64_IESB
+	bool "Enable Implicit Error Synchronization Barrier (IESB)"
+	default y
+	depends on ARM64_RAS_EXTN
+	help
+	  ARM v8.2 adds a feature to add implicit error synchronization
+	  barriers whenever the CPU enters or exits a particular exception
+	  level.
+
+	  On CPUs with this feature and the 'RAS Extensions' feature, we can
+	  use this to contain detected (but not yet reported) errors to the
+	  relevant exception level.
+
+	  The feature is detected at runtime, selecting this option will
+	  enable these implicit barriers if the CPU supports the feature.
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 4820d441bfb9..7a2bbbfdff49 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -41,7 +41,8 @@
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
 #define ARM64_HAS_RAS_EXTN			22
+#define ARM64_HAS_IESB				23
 
-#define ARM64_NCAPS				23
+#define ARM64_NCAPS				24
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 29adab8138c3..6b72ddc33d06 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -193,5 +193,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
+int cpu_enable_iesb(void *__unused);
 
 #endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 64e2a80fd749..4500a70c6a57 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -297,6 +297,7 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_IESB	(1 << 21)
 #define SCTLR_ELx_I	(1 << 12)
 #define SCTLR_ELx_SA	(1 << 3)
 #define SCTLR_ELx_C	(1 << 2)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0fc017b55cb1..626539995316 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -912,6 +912,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.field_pos = ID_AA64PFR0_RAS_SHIFT,
 		.min_field_value = ID_AA64PFR0_RAS_V1,
 	},
+#ifdef CONFIG_ARM64_IESB
+	{
+		.desc = "Implicit Error Synchronization Barrier",
+		.capability = ARM64_HAS_IESB,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64MMFR2_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR2_IESB_SHIFT,
+		.min_field_value = 1,
+		.enable = cpu_enable_iesb,
+	},
+#endif /* CONFIG_ARM64_IESB */
 #endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
@@ -1321,3 +1334,11 @@ static int __init enable_mrs_emulation(void)
 }
 
 late_initcall(enable_mrs_emulation);
+
+int cpu_enable_iesb(void *__unused)
+{
+	if (cpus_have_cap(ARM64_HAS_RAS_EXTN))
+		config_sctlr_el1(0, SCTLR_ELx_IESB);
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 3f9615582377..8983e9473017 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -113,6 +113,9 @@ __do_hyp_init:
 	 */
 	ldr	x4, =(SCTLR_EL2_RES1 | (SCTLR_ELx_FLAGS & ~SCTLR_ELx_A))
 CPU_BE(	orr	x4, x4, #SCTLR_ELx_EE)
+alternative_if ARM64_HAS_IESB
+	orr	x4, x4, #SCTLR_ELx_IESB
+alternative_else_nop_endif
 	msr	sctlr_el2, x4
 	isb
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 14/20] arm64: kernel: Prepare for a DISR user
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

KVM would like to consume any pending SError (or RAS error) after guest
exit. Today it has to unmask SError and use dsb+isb to synchronise the
CPU. With the RAS extensions we can use ESB to synchronise any pending
SError.

Add the necessary macros to allow DISR to be read and converted to an
ESR.

We clear the DISR register when we enable the RAS cpufeature. While the
kernel has issued ESB instructions before this point, it has only done so
with SError unmasked, so any value we find in DISR must have belonged to
firmware. Executing an ESB instruction is the only way to update DISR,
so we can expect firmware to have handled any deferred SError. By the
same logic we clear DISR in the idle path.
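
For illustration only (not part of this patch), a minimal sketch of how a
caller could consume a deferred SError with the definitions added here. The
barrier is open-coded as its hint encoding, matching the esb assembler
macro below, and the function name is made up:

	static inline void example_consume_deferred_serror(void)
	{
		u64 disr;

		asm volatile("hint #16" : : : "memory");	/* esb */
		disr = read_sysreg_s(SYS_DISR_EL1);
		if (disr) {					/* an SError was deferred */
			write_sysreg_s(0, SYS_DISR_EL1);
			/* act on disr_to_esr(disr) */
		}
	}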

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h |  7 +++++++
 arch/arm64/include/asm/esr.h       |  7 +++++++
 arch/arm64/include/asm/exception.h | 14 ++++++++++++++
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     |  9 +++++++++
 arch/arm64/mm/proc.S               |  5 +++++
 7 files changed, 44 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 7ffb2a629dc9..d4c0adf792ee 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -108,6 +108,13 @@
 	.endm
 
 /*
+ * RAS Error Synchronization barrier
+ */
+	.macro  esb
+	hint    #16
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 8ea52f15bf1c..b3f17fd18b14 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -136,6 +136,13 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
+#define DISR_EL1_IDS		(UL(1) << 24)
+/*
+ * DISR_EL1 and ESR_ELx share the bottom 13 bits, but the RES0 bits may mean
+ * different things in the future...
+ */
+#define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
 /* ESR value templates for specific events */
 
 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 0c2eec490abf..bc30429d8e91 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#include <asm/esr.h>
+
 #include <linux/interrupt.h>
 
 #define __exception	__attribute__((section(".exception.text")))
@@ -27,4 +29,16 @@
 #define __exception_irq_entry	__exception
 #endif
 
+static inline u32 disr_to_esr(u64 disr)
+{
+	unsigned int esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;
+
+	if ((disr & DISR_EL1_IDS) == 0)
+		esr |= (disr & DISR_EL1_ESR_MASK);
+	else
+		esr |= (disr & ESR_ELx_ISS_MASK);
+
+	return esr;
+}
+
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 6b72ddc33d06..9de3f839be8a 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -194,5 +194,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
 int cpu_enable_iesb(void *__unused);
+int cpu_clear_disr(void *__unused);
 
 #endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 4500a70c6a57..427c36bc5dd6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -179,6 +179,7 @@
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
 #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
+#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1,  1)
 
 #define SYS_ICC_IAR0_EL1		sys_reg(3, 0, 12, 8, 0)
 #define SYS_ICC_EOIR0_EL1		sys_reg(3, 0, 12, 8, 1)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 626539995316..da8fd37c0984 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -911,6 +911,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.sign = FTR_UNSIGNED,
 		.field_pos = ID_AA64PFR0_RAS_SHIFT,
 		.min_field_value = ID_AA64PFR0_RAS_V1,
+		.enable = cpu_clear_disr,
 	},
 #ifdef CONFIG_ARM64_IESB
 	{
@@ -1342,3 +1343,11 @@ int cpu_enable_iesb(void *__unused)
 
 	return 0;
 }
+
+int cpu_clear_disr(void *__unused)
+{
+	/* Firmware may have left a deferred SError in this register. */
+	write_sysreg_s(0, SYS_DISR_EL1);
+
+	return 0;
+}
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233dfc4c39..ac571223672d 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -124,6 +124,11 @@ ENTRY(cpu_do_resume)
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
+
+alternative_if ARM64_HAS_RAS_EXTN
+	msr_s	SYS_DISR_EL1, xzr
+alternative_else_nop_endif
+
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, Dongjiu Geng, kvmarm

Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
generated an SError with an implementation defined ESR_EL1.ISS, because we
had no mechanism to specify the ESR value.

On Juno this generates an all-zero ESR; the most significant bit, 'ISV',
is clear, indicating the remainder of the ISS field is invalid.

With the RAS Extensions we have a mechanism to specify this value, and the
most significant bit has a new meaning: 'IDS - Implementation Defined
Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
instead of 'no valid ISS'.

Add KVM support for the VSESR_EL2 register to specify an ESR value when
HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
specify an implementation-defined value.

We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set; KVM
saves/restores this bit during __deactivate_traps(), and hardware clears
the bit once the guest has consumed the virtual SError.

Future patches may add an API (or KVM CAP) to pend a virtual SError with
a specified ESR.
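
A purely hypothetical sketch of such an API (the function name is made up
and nothing in this series defines it; pend_guest_serror() is the static
helper added below):

	/* hypothetical: pend a guest SError with a caller-supplied syndrome */
	static int example_set_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
	{
		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
			return -EINVAL;	/* no VSESR_EL2, so the ESR can't be specified */

		pend_guest_serror(vcpu, esr & ESR_ELx_ISS_MASK);
		return 0;
	}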

Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  3 +++
 arch/arm64/include/asm/sysreg.h      |  1 +
 arch/arm64/kvm/hyp/switch.c          |  4 ++++
 arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
 5 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index e5df3fce0008..8a7a838eb17a 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
 	vcpu->arch.hcr_el2 = hcr;
 }
 
+static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
+{
+	vcpu->arch.vsesr_el2 = vsesr;
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
 	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d3eb79a9ed6b..0af35e71fedb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -277,6 +277,9 @@ struct kvm_vcpu_arch {
 
 	/* Detect first run of a vcpu */
 	bool has_run_once;
+
+	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
+	u64 vsesr_el2;
 };
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 427c36bc5dd6..a493e93de296 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -253,6 +253,7 @@
 
 #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
 #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
+#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c641c4..af37658223a0 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 		isb();
 	}
 	write_sysreg(val, hcr_el2);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
+		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
 	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
 	write_sysreg(1 << 15, hstr_el2);
 	/*
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index da6a8cfa54a0..52f7f66f1356 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
+static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+{
+	vcpu_set_vsesr(vcpu, esr);
+	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+}
+
 /**
  * kvm_inject_vabt - inject an async abort / SError into the guest
  * @vcpu: The VCPU to receive the exception
  *
  * It is assumed that this code is called from the VCPU thread and that the
  * VCPU therefore is not currently executing guest code.
+ *
+ * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
+ * the remaining ISS all-zeros so that this error is not interpreted as an
+ * uncategorized RAS error. Without the RAS Extensions we can't specify an ESR
+ * value, so the CPU generates an imp-def value.
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+	pend_guest_serror(vcpu, ESR_ELx_ISV);
 }
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 16/20] KVM: arm64: Save/Restore guest DISR_EL1
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

If we deliver a virtual SError to the guest, the guest may defer it
with an ESB instruction. The guest reads the deferred value via DISR_EL1,
but the guest's view of DISR_EL1 is re-mapped to VDISR_EL2 when HCR_EL2.AMO
is set.

Add the KVM code to save/restore VDISR_EL2, and make it accessible to
userspace as DISR_EL1.
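
As an illustration of the userspace side (not part of this patch, and it
assumes the sys_reg_descs entry below exposes the register through the
usual ONE_REG interface; vcpu_fd is a placeholder for the vcpu's file
descriptor), a VMM could read the value for migration like this:

	__u64 disr;
	struct kvm_one_reg reg = {
		.id   = ARM64_SYS_REG(3, 0, 12, 1, 1),	/* DISR_EL1: op0=3, op1=0, CRn=12, CRm=1, op2=1 */
		.addr = (__u64)(unsigned long)&disr,
	};

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
		perror("KVM_GET_ONE_REG(DISR_EL1)");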

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 1 +
 arch/arm64/include/asm/sysreg.h   | 1 +
 arch/arm64/kvm/hyp/sysreg-sr.c    | 6 ++++++
 arch/arm64/kvm/sys_regs.c         | 1 +
 4 files changed, 9 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0af35e71fedb..9564e6d56b79 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -120,6 +120,7 @@ enum vcpu_sysreg {
 	PAR_EL1,	/* Physical Address Register */
 	MDSCR_EL1,	/* Monitor Debug System Control Register */
 	MDCCINT_EL1,	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+	DISR_EL1,	/* Deferred Interrupt Status Register */
 
 	/* Performance Monitors Registers */
 	PMCR_EL0,	/* Control Register */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index a493e93de296..1b8b9012234d 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -256,6 +256,7 @@
 #define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
+#define SYS_VDISR_EL2			sys_reg(3, 4, 12, 1,  1)
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
 #define SYS_ICH_AP0R0_EL2		__SYS__AP0Rx_EL2(0)
 #define SYS_ICH_AP0R1_EL2		__SYS__AP0Rx_EL2(1)
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 934137647837..f4d604803b29 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -66,6 +66,9 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
 	ctxt->gp_regs.sp_el1		= read_sysreg(sp_el1);
 	ctxt->gp_regs.elr_el1		= read_sysreg_el1(elr);
 	ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_save_host_state,
@@ -119,6 +122,9 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
 	write_sysreg(ctxt->gp_regs.sp_el1,		sp_el1);
 	write_sysreg_el1(ctxt->gp_regs.elr_el1,		elr);
 	write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_restore_host_state,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e070d3baf9f..713275b501ce 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -963,6 +963,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
 	{ SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 },
+	{ SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 },
 
 	{ SYS_DESC(SYS_ICC_IAR0_EL1), write_to_read_only },
 	{ SYS_DESC(SYS_ICC_EOIR0_EL1), read_from_write_only },
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 17/20] KVM: arm64: Save ESR_EL2 on guest SError
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index af37658223a0..cba6d8ac105c 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -230,13 +230,20 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
 	return true;
 }
 
+static void __hyp_text __populate_fault_info_esr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
+}
+
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u64 esr = read_sysreg_el2(esr);
-	u8 ec = ESR_ELx_EC(esr);
+	u8 ec;
+	u64 esr;
 	u64 hpfar, far;
 
-	vcpu->arch.fault.esr_el2 = esr;
+	__populate_fault_info_esr(vcpu);
+	esr = vcpu->arch.fault.esr_el2;
+	ec = ESR_ELx_EC(esr);
 
 	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
 		return true;
@@ -325,6 +332,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
 		goto again;
+	else if (ARM_EXCEPTION_CODE(exit_code) == ARM_EXCEPTION_EL1_SERROR)
+		__populate_fault_info_esr(vcpu);
 
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
 	    exit_code == ARM_EXCEPTION_TRAP) {
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 18/20] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either of which
may be a RAS error: we exit the guest due to an SError routed to EL2 by
HCR_EL2.AMO, or we take an SError from EL2 when we unmask PSTATE.A from
__guest_exit.

For SErrors that interrupt a guest and are routed to EL2, the existing
behaviour is to inject an impdef SError into the guest.

Add code to handle RAS SErrors based on the ESR. For uncontained errors
arm64_blocking_ras_serror() will panic(); these errors compromise the
host too. All other error types are contained: for the 'blocking' errors
the vCPU can't make progress, so we inject a virtual SError. We ignore
contained errors where we can make progress: if we're lucky, we may not
hit them again.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kvm/handle_exit.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7debb74843a0..345fdbba6c2e 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -28,12 +28,19 @@
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_psci.h>
+#include <asm/traps.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
 typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
 
+static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
+{
+	if (!arm64_is_ras_serror(esr) || arm64_blocking_ras_serror(NULL, esr))
+		kvm_inject_vabt(vcpu);
+}
+
 static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	int ret;
@@ -211,7 +218,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	case ARM_EXCEPTION_IRQ:
 		return 1;
 	case ARM_EXCEPTION_EL1_SERROR:
-		kvm_inject_vabt(vcpu);
+		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
 		return 1;
 	case ARM_EXCEPTION_TRAP:
 		/*
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either of which
may be a RAS error: we exit the guest due to an SError routed to EL2 by
HCR_EL2.AMO, or we take an SError from EL2 when we unmask PSTATE.A from
__guest_exit.

The current SError from EL2 code unmasks SError and tries to fence any
pending SError into a single instruction window. It then leaves SError
unmasked.

With the v8.2 RAS Extensions we may take an SError for a 'corrected'
error, but KVM is only able to handle SErrors from EL2 if they occur
during this single instruction window...

The RAS Extensions give us a new instruction to synchronise and
consume SErrors. The RAS Extensions document (ARM DDI0587),
'2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
SError interrupts generated by 'instructions, translation table walks,
hardware updates to the translation tables, and instruction fetches on
the same PE'. This makes ESB equivalent to KVM's existing
'dsb, mrs-daifclr, isb' sequence.

Use the alternatives to synchronise and consume any SError using ESB
instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
in the exit_code so that we can restart the vcpu if it turns out this
SError has no impact on the vcpu.
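
The entry.S hunk below is roughly equivalent to this C (a sketch only; the
real code has to live in the __guest_exit assembly, and esb() stands for a
C wrapper around the barrier):

	esb();						/* defer any pending SError */
	disr = read_sysreg_s(SYS_DISR_EL1);
	vcpu->arch.fault.disr_el1 = disr;
	if (disr) {
		write_sysreg_s(0, SYS_DISR_EL1);	/* consume it */
		exit_code |= 1 << ARM_EXIT_WITH_SERROR_BIT;
	}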

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  1 +
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kvm/handle_exit.c         | 10 +++++++++-
 arch/arm64/kvm/hyp/entry.S           | 13 +++++++++++++
 5 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 8a7a838eb17a..8274d16df3cd 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -173,6 +173,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
 	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
 }
 
+static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.disr_el1;
+}
+
 static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9564e6d56b79..b3c53d8b6a60 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -89,6 +89,7 @@ struct kvm_vcpu_fault_info {
 	u32 esr_el2;		/* Hyp Syndrom Register */
 	u64 far_el2;		/* Hyp Fault Address Register */
 	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
+	u64 disr_el1;		/* Deferred [SError] Status Register */
 };
 
 /*
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088f1e4b..121889c49542 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -130,6 +130,7 @@ int main(void)
   BLANK();
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
+  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
   DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
   DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 345fdbba6c2e..e1e6cfe7d4d9 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -23,6 +23,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/esr.h>
+#include <asm/exception.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_emulate.h>
@@ -208,7 +209,14 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			*vcpu_pc(vcpu) -= adj;
 		}
 
-		kvm_inject_vabt(vcpu);
+		if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+			u64 disr = kvm_vcpu_get_disr(vcpu);
+
+			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
+		} else {
+			kvm_inject_vabt(vcpu);
+		}
+
 		return 1;
 	}
 
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 12ee62d6d410..96caa5328b3a 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -124,6 +124,17 @@ ENTRY(__guest_exit)
 	// Now restore the host regs
 	restore_callee_saved_regs x2
 
+alternative_if ARM64_HAS_RAS_EXTN
+	// If we have the RAS extensions we can consume a pending error
+	// without an unmask-SError and isb.
+	esb
+	mrs_s	x2, SYS_DISR_EL1
+	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
+	cbz	x2, 1f
+	msr_s	SYS_DISR_EL1, xzr
+	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
+1:	ret
+alternative_else
 	// If we have a pending asynchronous abort, now is the
 	// time to find out. From your VAXorcist book, page 666:
 	// "Threaten me not, oh Evil one!  For I speak with
@@ -135,6 +146,8 @@ ENTRY(__guest_exit)
 
 	dsb	sy		// Synchronize against in-flight ld/st
 	msr	daifclr, #4	// Unmask aborts
+	nop
+alternative_endif
 
 	// This is our single instruction exception window. A pending
 	// SError is guaranteed to occur at the earliest when we unmask
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 20/20] KVM: arm64: Take any host SError before entering the guest
  2017-10-05 19:17 ` James Morse
@ 2017-10-05 19:18   ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

On VHE systems KVM masks SError before switching the VBAR value. Any
host RAS error that the CPU knew about before world-switch may become
pending as an SError during world-switch, and only be taken once we enter
the guest.

Until KVM can take RAS SErrors during world switch, add an ESB to
force any RAS errors to be synchronised and taken on the host before
we enter world switch.

RAS errors that become pending during world switch are still taken
once we enter the guest.
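
The esb() helper used below is assumed to be a C wrapper, added earlier in
the series, around the barrier encoding from the assembler macro; roughly:

	static inline void esb(void)
	{
		asm volatile("hint #16" : : : "memory");	/* RAS Error Synchronization Barrier */
	}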

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b3c53d8b6a60..6a48f9d18c58 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -392,6 +392,7 @@ static inline void __cpu_init_stage2(void)
 
 static inline void kvm_arm_vhe_guest_enter(void)
 {
+	esb();
 	local_mask_daif();
 }
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/20] KVM: arm/arm64: mask/unmask daif around VHE guests
  2017-10-05 19:18   ` James Morse
@ 2017-10-11  9:01     ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2017-10-11  9:01 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, wangxiongfeng2,
	linux-arm-kernel, kvmarm

Hi James,

On Thu, Oct 05 2017 at  8:18:01 pm BST, James Morse <james.morse@arm.com> wrote:
> Non-VHE systems take an exception to EL2 in order to world-switch into the
> guest. When returning from the guest KVM implicitly restores the DAIF
> flags when it returns to the kernel at EL1.
>
> With VHE none of this exception-level jumping happens, so KVMs
> world-switch code is exposed to the host kernel's DAIF values, and KVM
> spills the guest-exit DAIF values back into the host kernel.
> On entry to a guest we have Debug and SError exceptions unmasked, KVM
> has switched VBAR but isn't prepared to handle these. On guest exit
> Debug exceptions are left disabled once we return to the host and will
> stay this way until we enter user space.
>
> Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
> happen after the hosts VBAR value has been synchronised by the isb in
> __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
> setting KVMs VBAR value, but is kept here for symmetry.
>
> Signed-off-by: James Morse <james.morse@arm.com>
>
> ---
> Give me a kick if you want this reworked as a fix (which will then
> conflict with this series), or a backportable version.
>
>  arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
>  virt/kvm/arm/arm.c                |  4 ++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e923b58606e2..d3eb79a9ed6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -25,6 +25,7 @@
>  #include <linux/types.h>
>  #include <linux/kvm_types.h>
>  #include <asm/cpufeature.h>
> +#include <asm/daifflags.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmio.h>
> @@ -384,4 +385,13 @@ static inline void __cpu_init_stage2(void)
>  		  "PARange is %d bits, unsupported configuration!", parange);
>  }
>  
> +static inline void kvm_arm_vhe_guest_enter(void)
> +{
> +	local_mask_daif();
> +}
> +
> +static inline void kvm_arm_vhe_guest_exit(void)
> +{
> +	local_restore_daif(DAIF_PROCCTX_NOIRQ);
> +}
>  #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index b9f68e4add71..665529924b34 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -698,9 +698,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		 */
>  		trace_kvm_entry(*vcpu_pc(vcpu));
>  		guest_enter_irqoff();
> +		if (has_vhe())
> +			kvm_arm_vhe_guest_enter();
>  
>  		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>  
> +		if (has_vhe())
> +			kvm_arm_vhe_guest_exit();
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>  		vcpu->stat.exits++;
>  		/*

Why is that masking limited to entering/exiting the guest? I would have
thought that it would have been put in the kvm_call_hyp helper, in order
to cover all "HYP" accesses. Or is it that you've worked out that only
the guest run actually requires this because none of the other HYP
helpers are changing the flags?

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2017-10-05 19:18   ` James Morse
@ 2017-10-11 10:37     ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2017-10-11 10:37 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, wangxiongfeng2,
	linux-arm-kernel, kvmarm

On Thu, Oct 05 2017 at  8:18:11 pm BST, James Morse <james.morse@arm.com> wrote:
> We expect to have firmware-first handling of RAS SErrors, with errors
> notified via an APEI method. For systems without firmware-first, add
> some minimal handling to KVM.
>
> There are two ways KVM can take an SError due to a guest, either may be a
> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
>
> The current SError from EL2 code unmasks SError and tries to fence any
> pending SError into a single instruction window. It then leaves SError
> unmasked.
>
> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
> error, but KVM is only able to handle SError from EL2 if they occur
> during this single instruction window...
>
> The RAS Extensions give us a new instruction to synchronise and
> consume SErrors. The RAS Extensions document (ARM DDI0587),
> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
> SError interrupts generated by 'instructions, translation table walks,
> hardware updates to the translation tables, and instruction fetches on
> the same PE'. This makes ESB equivalent to KVMs existing
> 'dsb, mrs-daifclr, isb' sequence.
>
> Use the alternatives to synchronise and consume any SError using ESB
> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
> in the exit_code so that we can restart the vcpu if it turns out this
> SError has no impact on the vcpu.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++++
>  arch/arm64/include/asm/kvm_host.h    |  1 +
>  arch/arm64/kernel/asm-offsets.c      |  1 +
>  arch/arm64/kvm/handle_exit.c         | 10 +++++++++-
>  arch/arm64/kvm/hyp/entry.S           | 13 +++++++++++++
>  5 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 8a7a838eb17a..8274d16df3cd 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -173,6 +173,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
>  	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
>  }
>  
> +static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.fault.disr_el1;
> +}
> +
>  static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
>  {
>  	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 9564e6d56b79..b3c53d8b6a60 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -89,6 +89,7 @@ struct kvm_vcpu_fault_info {
>  	u32 esr_el2;		/* Hyp Syndrom Register */
>  	u64 far_el2;		/* Hyp Fault Address Register */
>  	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
> +	u64 disr_el1;		/* Deferred [SError] Status Register */
>  };
>  
>  /*
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 71bf088f1e4b..121889c49542 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -130,6 +130,7 @@ int main(void)
>    BLANK();
>  #ifdef CONFIG_KVM_ARM_HOST
>    DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
> +  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
>    DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
>    DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
>    DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 345fdbba6c2e..e1e6cfe7d4d9 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -23,6 +23,7 @@
>  #include <linux/kvm_host.h>
>  
>  #include <asm/esr.h>
> +#include <asm/exception.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_coproc.h>
>  #include <asm/kvm_emulate.h>
> @@ -208,7 +209,14 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  			*vcpu_pc(vcpu) -= adj;
>  		}
>  
> -		kvm_inject_vabt(vcpu);
> +		if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
> +			u64 disr = kvm_vcpu_get_disr(vcpu);
> +
> +			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
> +		} else {
> +			kvm_inject_vabt(vcpu);
> +		}
> +
>  		return 1;
>  	}
>  
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index 12ee62d6d410..96caa5328b3a 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -124,6 +124,17 @@ ENTRY(__guest_exit)
>  	// Now restore the host regs
>  	restore_callee_saved_regs x2
>  
> +alternative_if ARM64_HAS_RAS_EXTN
> +	// If we have the RAS extensions we can consume a pending error
> +	// without an unmask-SError and isb.
> +	esb
> +	mrs_s	x2, SYS_DISR_EL1
> +	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
> +	cbz	x2, 1f
> +	msr_s	SYS_DISR_EL1, xzr
> +	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
> +1:	ret
> +alternative_else
>  	// If we have a pending asynchronous abort, now is the
>  	// time to find out. From your VAXorcist book, page 666:
>  	// "Threaten me not, oh Evil one!  For I speak with
> @@ -135,6 +146,8 @@ ENTRY(__guest_exit)
>  
>  	dsb	sy		// Synchronize against in-flight ld/st
>  	msr	daifclr, #4	// Unmask aborts
> +	nop

Oops. You've now introduced an instruction in what was supposed to be a
single instruction window (the isb). It means that we may fail to
identify the SError as having been generated by our synchronisation
mechanism, and we'll panic for no good reason.

Moving the nop up will solve this.

> +alternative_endif
>  
>  	// This is our single instruction exception window. A pending
>  	// SError is guaranteed to occur at the earliest when we unmask

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/20] KVM: arm/arm64: mask/unmask daif around VHE guests
  2017-10-11  9:01     ` Marc Zyngier
@ 2017-10-11 15:40       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-11 15:40 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, wangxiongfeng2,
	linux-arm-kernel, kvmarm

Hi Marc,

On 11/10/17 10:01, Marc Zyngier wrote:
> On Thu, Oct 05 2017 at  8:18:01 pm BST, James Morse <james.morse@arm.com> wrote:
>> Non-VHE systems take an exception to EL2 in order to world-switch into the
>> guest. When returning from the guest KVM implicitly restores the DAIF
>> flags when it returns to the kernel at EL1.
>>
>> With VHE none of this exception-level jumping happens, so KVMs
>> world-switch code is exposed to the host kernel's DAIF values, and KVM
>> spills the guest-exit DAIF values back into the host kernel.
>> On entry to a guest we have Debug and SError exceptions unmasked, KVM
>> has switched VBAR but isn't prepared to handle these. On guest exit
>> Debug exceptions are left disabled once we return to the host and will
>> stay this way until we enter user space.
>>
>> Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
>> happen after the hosts VBAR value has been synchronised by the isb in
>> __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
>> setting KVMs VBAR value, but is kept here for symmetry.


>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index b9f68e4add71..665529924b34 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -698,9 +698,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  		 */
>>  		trace_kvm_entry(*vcpu_pc(vcpu));
>>  		guest_enter_irqoff();
>> +		if (has_vhe())
>> +			kvm_arm_vhe_guest_enter();
>>  
>>  		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>>  
>> +		if (has_vhe())
>> +			kvm_arm_vhe_guest_exit();
>>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>>  		vcpu->stat.exits++;
>>  		/*

> Why is that masking limited to entering/exiting the guest? I would have
> though that it would have been put in the kvm_call_hyp helper, in order
> to cover all "HYP" accesses.

> Or is it that you've worked out that only
> the guest run actually requires this because none of the other HYP
> helpers are changing the flags?

That too... Christoffer made the case[0] that for VHE the existing 'hyp code'
shouldn't be considered as running in a 'special EL2 mode':
> The rationale being that in the long run we want to keep "jumping to
> hyp" the oddball legacy case, where everything else is just the
> kernel/hypervisor functionality.

This lets us take interrupts out of e.g. __kvm_tlb_flush_local_vmid().

These are the things kvm calls via kvm_call_hyp():
> __kvm_get_mdcr_el2
> __init_stage2_translation
> __kvm_tlb_flush_local_vmid
> __kvm_flush_vm_context
> __kvm_vcpu_run
> __kvm_tlb_flush_vmid
> __kvm_tlb_flush_vmid_ipa
> __vgic_v3_init_lrs
> __vgic_v3_get_ich_vtr_el2
> __vgic_v3_write_vmcr
> __vgic_v3_read_vmcr

These all read/write system-registers, but only __kvm_vcpu_run() manipulates the
flags due to taking an exception to exit the guest.

__kvm_vcpu_run() should also be masking exceptions when it changes VBAR.

Only __kvm_vcpu_run() needs wrapping like this; if any other helper touches the
debug registers or exception-routing, I think it would need to do something
similar for VHE.

(__vgic_v3_get_ich_vtr_el2() is also preemptible, but all it does is read an id
register which looks safe to me...)
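
For reference, with patch 20 applied on top, the two helpers being discussed
end up looking roughly like this (a sketch assembled from the hunks quoted in
this thread, not verbatim kernel code):

	static inline void kvm_arm_vhe_guest_enter(void)
	{
		/* Take any pending host RAS SError now, while still unmasked. */
		esb();
		/* Mask D, A, I and F for the world-switch in __kvm_vcpu_run(). */
		local_mask_daif();
	}

	static inline void kvm_arm_vhe_guest_exit(void)
	{
		/* Debug and SError unmasked again; IRQs stay masked for the run loop. */
		local_restore_daif(DAIF_PROCCTX_NOIRQ);
	}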


Thanks,

James

[0] https://www.spinics.net/lists/arm-kernel/msg603990.html

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions
  2017-10-05 19:17   ` James Morse
@ 2017-10-11 16:30     ` Julien Thierry
  -1 siblings, 0 replies; 74+ messages in thread
From: Julien Thierry @ 2017-10-11 16:30 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, wangxiongfeng2

Hi James,

On 05/10/17 20:17, James Morse wrote:
> There are a few places where we want to mask all exceptions. Today we
> do this in a piecemeal fashion, typically we expect the caller to
> have masked irqs and the arch code masks debug exceptions, ignoring
> SError which is probably masked.
> 
> Make it clear that 'mask all exceptions' is the intention by adding
> helpers to do exactly that.
> 
> This will let us unmask SError without having to add 'oh and SError'
> to these paths.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> Remove the disable IRQs comment above cpu_die(): we return from idle via
> cpu_resume. This comment is confusing once the local_irq_disable() call
> is changed.
> 
>   arch/arm64/include/asm/assembler.h | 17 +++++++++++
>   arch/arm64/include/asm/daifflags.h | 59 ++++++++++++++++++++++++++++++++++++++
>   arch/arm64/kernel/hibernate.c      |  5 ++--
>   arch/arm64/kernel/machine_kexec.c  |  4 +--
>   arch/arm64/kernel/smp.c            |  9 ++----
>   arch/arm64/kernel/suspend.c        |  7 +++--
>   arch/arm64/kernel/traps.c          |  3 +-
>   arch/arm64/mm/proc.S               |  9 +++---
>   8 files changed, 94 insertions(+), 19 deletions(-)
>   create mode 100644 arch/arm64/include/asm/daifflags.h

[...]

> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> new file mode 100644
> index 000000000000..fb40da8e1457
> --- /dev/null
> +++ b/arch/arm64/include/asm/daifflags.h
> @@ -0,0 +1,59 @@
> +/*
> + * Copyright (C) 2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_DAIFFLAGS_H
> +#define __ASM_DAIFFLAGS_H
> +
> +#include <asm/irqflags.h>
> +#include <linux/irqflags.h>
> +
> +/* Mask/unmask/restore all exceptions, including interrupts. */
> +static inline unsigned long local_mask_daif(void)
> +{
> +	unsigned long flags;
> +	asm volatile(
> +		"mrs	%0, daif		// local_mask_daif\n"
> +		"msr	daifset, #0xf"
> +		: "=r" (flags)
> +		:
> +		: "memory");
> +	trace_hardirqs_off();
> +	return flags;
> +}
> +
nit:
Can we call this local_daif_save? (and maybe a version that only 
disables without saving?)

Also, I was wondering why use 'local' as prefix? Is there gonna be a 
similar set of functions for arm that could be called by common code? 
(e.g. drivers?)

Otherwise:
Reviewed-by: Julien Thierry <julien.thierry@arm.com>

Cheers,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 02/20] arm64: introduce an order for exceptions
  2017-10-05 19:17   ` James Morse
@ 2017-10-11 17:11     ` Julien Thierry
  -1 siblings, 0 replies; 74+ messages in thread
From: Julien Thierry @ 2017-10-11 17:11 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, wangxiongfeng2



On 05/10/17 20:17, James Morse wrote:
> Currently SError is always masked in the kernel. To support RAS exceptions
> using SError on hardware with the v8.2 RAS Extensions we need to unmask
> SError as much as possible.
> 
> Let's define an order for masking and unmasking exceptions. 'dai' is
> memorable and effectively what we have today.
> 
> Disabling debug exceptions should cause all other exceptions to be masked.
> Masking SError should mask irq, but not disable debug exceptions.
> Masking irqs has no side effects for other flags. Keeping to this order
> makes it easier for entry.S to know which exceptions should be unmasked.
> 
> FIQ is never expected, but we mask it when we mask debug exceptions, and
> unmask it at all other times.
> 
> Given masking debug exceptions masks everything, we don't need macros
> to save/restore that bit independently. Remove them and switch the last
> caller over to {en,dis}able_daif().
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Reviewed-by: Julien Thierry <julien.thierry@arm.com>

> ---
>   arch/arm64/include/asm/irqflags.h  | 34 +++++++++++++---------------------
>   arch/arm64/kernel/debug-monitors.c |  5 +++--
>   2 files changed, 16 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index 8c581281fa12..9ecdca7011f0 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -21,6 +21,19 @@
>   #include <asm/ptrace.h>
>   
>   /*
> + * Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
> + * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'dai'
> + * order:
> + * Masking debug exceptions causes all other exceptions to be masked too/
> + * Masking SError masks irq, but not debug exceptions. Masking irqs has no
> + * side effects for other flags. Keeping to this order makes it easier for
> + * entry.S to know which exceptions should be unmasked.
> + *
> + * FIQ is never expected, but we mask it when we disable debug exceptions, and
> + * unmask it at all other times.
> + */
> +
> +/*
>    * CPU interrupt mask handling.
>    */
>   static inline unsigned long arch_local_irq_save(void)
> @@ -89,26 +102,5 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
>   {
>   	return flags & PSR_I_BIT;
>   }
> -
> -/*
> - * save and restore debug state
> - */
> -#define local_dbg_save(flags)						\
> -	do {								\
> -		typecheck(unsigned long, flags);			\
> -		asm volatile(						\
> -		"mrs    %0, daif		// local_dbg_save\n"	\
> -		"msr    daifset, #8"					\
> -		: "=r" (flags) : : "memory");				\
> -	} while (0)
> -
> -#define local_dbg_restore(flags)					\
> -	do {								\
> -		typecheck(unsigned long, flags);			\
> -		asm volatile(						\
> -		"msr    daif, %0		// local_dbg_restore\n"	\
> -		: : "r" (flags) : "memory");				\
> -	} while (0)
> -
>   #endif
>   #endif
> diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> index c7ef99904934..bbfa4978b657 100644
> --- a/arch/arm64/kernel/debug-monitors.c
> +++ b/arch/arm64/kernel/debug-monitors.c
> @@ -30,6 +30,7 @@
>   
>   #include <asm/cpufeature.h>
>   #include <asm/cputype.h>
> +#include <asm/daifflags.h>
>   #include <asm/debug-monitors.h>
>   #include <asm/system_misc.h>
>   
> @@ -46,9 +47,9 @@ u8 debug_monitors_arch(void)
>   static void mdscr_write(u32 mdscr)
>   {
>   	unsigned long flags;
> -	local_dbg_save(flags);
> +	flags = local_mask_daif();
>   	write_sysreg(mdscr, mdscr_el1);
> -	local_dbg_restore(flags);
> +	local_restore_daif(flags);
>   }
>   NOKPROBE_SYMBOL(mdscr_write);
>   
> 

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 08/20] arm64: entry.S: convert elX_irq
  2017-10-05 19:18   ` James Morse
@ 2017-10-11 17:13     ` Julien Thierry
  -1 siblings, 0 replies; 74+ messages in thread
From: Julien Thierry @ 2017-10-11 17:13 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, wangxiongfeng2



On 05/10/17 20:18, James Morse wrote:
> Following our 'dai' order, irqs should be processed with debug and
> serror exceptions unmasked.
>  > Add a helper to unmask these two, (and fiq for good measure).
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>   arch/arm64/include/asm/assembler.h | 4 ++++
>   arch/arm64/kernel/entry.S          | 4 ++--
>   2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index c2a37e2f733c..7ffb2a629dc9 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -54,6 +54,10 @@
>   	msr	daif, \tmp
>   	.endm
>   
> +	.macro enable_da_f
> +	msr	daifclr, #(8 | 4 | 1)
> +	.endm
> +

We use this in irq entries because we are inheriting daif + we want to 
disable irqs while we process irqs, right?

I don't know if it's worth adding a comment but I find it easier to 
think about it like this.

Otherwise, for patches 3 to 8 (I don't have any comment on 3 to 7):
Reviewed-by: Julien Thierry <julien.thierry@arm.com>

>   /*
>    * Enable and disable interrupts.
>    */
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index f7dfe5d2b1fb..df085ec003b0 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -557,7 +557,7 @@ ENDPROC(el1_sync)
>   	.align	6
>   el1_irq:
>   	kernel_entry 1
> -	enable_dbg
> +	enable_da_f
>   #ifdef CONFIG_TRACE_IRQFLAGS
>   	bl	trace_hardirqs_off
>   #endif
> @@ -766,7 +766,7 @@ ENDPROC(el0_sync)
>   el0_irq:
>   	kernel_entry 0
>   el0_irq_naked:
> -	enable_dbg
> +	enable_da_f
>   #ifdef CONFIG_TRACE_IRQFLAGS
>   	bl	trace_hardirqs_off
>   #endif
> 

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions
  2017-10-11 16:30     ` Julien Thierry
@ 2017-10-12 12:26       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-12 12:26 UTC (permalink / raw)
  To: Julien Thierry
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, wangxiongfeng2, linux-arm-kernel

Hi Julien,

On 11/10/17 17:30, Julien Thierry wrote:
> On 05/10/17 20:17, James Morse wrote:
>> There are a few places where we want to mask all exceptions. Today we
>> do this in a piecemeal fashion, typically we expect the caller to
>> have masked irqs and the arch code masks debug exceptions, ignoring
>> SError which is probably masked.
>>
>> Make it clear that 'mask all exceptions' is the intention by adding
>> helpers to do exactly that.
>>
>> This will let us unmask SError without having to add 'oh and SError'
>> to these paths.

>> diff --git a/arch/arm64/include/asm/daifflags.h
>> b/arch/arm64/include/asm/daifflags.h
>> new file mode 100644
>> index 000000000000..fb40da8e1457
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/daifflags.h

>> +#ifndef __ASM_DAIFFLAGS_H
>> +#define __ASM_DAIFFLAGS_H
>> +
>> +#include <asm/irqflags.h>
>> +#include <linux/irqflags.h>
>> +
>> +/* Mask/unmask/restore all exceptions, including interrupts. */
>> +static inline unsigned long local_mask_daif(void)
>> +{
>> +    unsigned long flags;
>> +    asm volatile(
>> +        "mrs    %0, daif        // local_mask_daif\n"
>> +        "msr    daifset, #0xf"
>> +        : "=r" (flags)
>> +        :
>> +        : "memory");
>> +    trace_hardirqs_off();
>> +    return flags;
>> +}
>> +
> nit:
> Can we call this local_daif_save? 

Sure, having a save version fits better with the C irqflags versions.

mask/unmask is just being pedantic: irqs and SError aren't disabled by these
flags (but curiously debug is...).


> (and maybe a version that only disables without saving?)

irqflags.h has one, so yes, why not.
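
Something like this, perhaps (a sketch of the renamed helpers, based on the
version quoted above; the names are illustrative rather than the final v4
code):

	/* Save the old flags and mask everything, including SError and debug. */
	static inline unsigned long local_daif_save(void)
	{
		unsigned long flags;

		asm volatile(
			"mrs	%0, daif		// local_daif_save\n"
			"msr	daifset, #0xf"
			: "=r" (flags) : : "memory");
		trace_hardirqs_off();
		return flags;
	}

	/* Mask everything without saving the old flags. */
	static inline void local_daif_mask(void)
	{
		asm volatile("msr	daifset, #0xf	// local_daif_mask" : : : "memory");
		trace_hardirqs_off();
	}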


> Also, I was wondering why use 'local' as prefix? Is there gonna be a similar set
> of function for arm that could be called by common code? (e.g. drivers?)

It's a hangover from the C irqflags calls like arch_local_save_flags() etc. I
dropped the 'arch' as these should only be called by arch code, and 'daif'
should prevent any name collisions.

Drivers should never touch this stuff; if they do, it's likely to be because
they want to defer bursting-into-flames until we get to userspace, where it's
harder to work out which driver to blame.


> Otherwise:
> Reviewed-by: Julien Thierry <julien.thierry@arm.com>

Thanks!


James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 08/20] arm64: entry.S: convert elX_irq
  2017-10-11 17:13     ` Julien Thierry
@ 2017-10-12 12:26       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-12 12:26 UTC (permalink / raw)
  To: Julien Thierry
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, wangxiongfeng2, linux-arm-kernel

Hi Julien,

On 11/10/17 18:13, Julien Thierry wrote:
> On 05/10/17 20:18, James Morse wrote:
>> Following our 'dai' order, irqs should be processed with debug and
>> serror exceptions unmasked.
>>  > Add a helper to unmask these two, (and fiq for good measure).
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> ---
>>   arch/arm64/include/asm/assembler.h | 4 ++++
>>   arch/arm64/kernel/entry.S          | 4 ++--
>>   2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h
>> b/arch/arm64/include/asm/assembler.h
>> index c2a37e2f733c..7ffb2a629dc9 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -54,6 +54,10 @@
>>       msr    daif, \tmp
>>       .endm
>>   +    .macro enable_da_f
>> +    msr    daifclr, #(8 | 4 | 1)
>> +    .endm
>> +
> 
> We use this in irq entries because we are inheriting daif + we want to disable
> irqs while we process irqs right?

In this case we're not inheriting; we only do that for synchronous exceptions
because we can't mask them, and keeping the flags 'the same' lets us ignore
them.

Here we are unconditionally unmasking things for handling interrupts. If we
stick with this dai order we know it's safe to unmask SError and enable Debug.


> I don't know if it's worth adding a comment but I find it easier to think about
> it like this.

If it's not obvious, there should be a comment:
/* IRQ is the lowest priority flag, unconditionally unmask the rest. */
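
In C terms the macro amounts to no more than this; a sketch for illustration
only (the function name is made up, the real code stays in assembler):

	/*
	 * Unmask Debug (8), SError (4) and FIQ (1) in PSTATE, leaving
	 * IRQ (2) masked while the interrupt is being handled.
	 */
	static inline void unmask_da_f(void)
	{
		asm volatile("msr	daifclr, #(8 | 4 | 1)" : : : "memory");
	}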


(I was expecting flames for the hangman style naming!)

> Otherwise, for patches 3 to 8 (I don't have any comment on 3 to 7):
> Reviewed-by: Julien Thierry <julien.thierry@arm.com>

Thanks!

I'll post a v4 ~tomorrow with this and the renaming changes.



James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2017-10-11 10:37     ` Marc Zyngier
@ 2017-10-12 12:28       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-12 12:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, wangxiongfeng2,
	linux-arm-kernel, kvmarm

Hi Marc,

On 11/10/17 11:37, Marc Zyngier wrote:
> On Thu, Oct 05 2017 at  8:18:11 pm BST, James Morse <james.morse@arm.com> wrote:
>> We expect to have firmware-first handling of RAS SErrors, with errors
>> notified via an APEI method. For systems without firmware-first, add
>> some minimal handling to KVM.
>>
>> There are two ways KVM can take an SError due to a guest, either may be a
>> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
>> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
>>
>> The current SError from EL2 code unmasks SError and tries to fence any
>> pending SError into a single instruction window. It then leaves SError
>> unmasked.
>>
>> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
>> error, but KVM is only able to handle SError from EL2 if they occur
>> during this single instruction window...
>>
>> The RAS Extensions give us a new instruction to synchronise and
>> consume SErrors. The RAS Extensions document (ARM DDI0587),
>> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
>> SError interrupts generated by 'instructions, translation table walks,
>> hardware updates to the translation tables, and instruction fetches on
>> the same PE'. This makes ESB equivalent to KVMs existing
>> 'dsb, mrs-daifclr, isb' sequence.
>>
>> Use the alternatives to synchronise and consume any SError using ESB
>> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
>> in the exit_code so that we can restart the vcpu if it turns out this
>> SError has no impact on the vcpu.

>> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
>> index 12ee62d6d410..96caa5328b3a 100644
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -124,6 +124,17 @@ ENTRY(__guest_exit)
>>  	// Now restore the host regs
>>  	restore_callee_saved_regs x2
>>  
>> +alternative_if ARM64_HAS_RAS_EXTN
>> +	// If we have the RAS extensions we can consume a pending error
>> +	// without an unmask-SError and isb.
>> +	esb
>> +	mrs_s	x2, SYS_DISR_EL1
>> +	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
>> +	cbz	x2, 1f
>> +	msr_s	SYS_DISR_EL1, xzr
>> +	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
>> +1:	ret
>> +alternative_else
>>  	// If we have a pending asynchronous abort, now is the
>>  	// time to find out. From your VAXorcist book, page 666:
>>  	// "Threaten me not, oh Evil one!  For I speak with
>> @@ -135,6 +146,8 @@ ENTRY(__guest_exit)
>>  
>>  	dsb	sy		// Synchronize against in-flight ld/st
>>  	msr	daifclr, #4	// Unmask aborts
>> +	nop
> 
> Oops. You've now introduced an instruction in what was supposed to be a
> single instruction window (the isb). It means that we may fail to
> identify the Serror as having been generated by our synchronisation
> mechanism, and we'll panic for no good reason.

Gah! And I pointed this thing out in the commit message,

>> +alternative_endif
>>  
>>  	// This is our single instruction exception window. A pending
>>  	// SError is guaranteed to occur at the earliest when we unmask

and here it is in the diff.


> Moving the nop up will solve this.

Yes.

I've fixed this for v4, along with Julien's suggestions.


Thanks!

James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-05 19:18   ` James Morse
@ 2017-10-13  9:25     ` gengdongjiu
  -1 siblings, 0 replies; 74+ messages in thread
From: gengdongjiu @ 2017-10-13  9:25 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, kvmarm

Hi James,

   After checking this patch, I think my patch [1] already includes this logic
(with only a small difference). My first version [2] set the virtual ESR in
KVM, but Marc and others disagreed [3][4] and proposed that userspace should
set the value and do the injection (when RAS is enabled). If you think we also
need to support injection by KVM, I can extend my patch to do that (though
based on the previous review comments I don't think we should). So I don't
think we need to submit another patch; it would duplicate work and waste
review time. Thank you very much, I will combine that.

[1] https://lkml.org/lkml/2017/8/28/497
[2] https://patchwork.kernel.org/patch/9633105/
[3] https://lkml.org/lkml/2017/3/20/441
[4] https://lkml.org/lkml/2017/3/20/516


On 2017/10/6 3:18, James Morse wrote:
> Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
> generated an SError with an implementation defined ESR_EL1.ISS, because we
> had no mechanism to specify the ESR value.
> 
> On Juno this generates an all-zero ESR, the most significant bit 'ISV'
> is clear indicating the remainder of the ISS field is invalid.
> 
> With the RAS Extensions we have a mechanism to specify this value, and the
> most significant bit has a new meaning: 'IDS - Implementation Defined
> Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
> instead of 'no valid ISS'.
> 
> Add KVM support for the VSESR_EL2 register to specify an ESR value when
> HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
> specify an implementation-defined value.
> 
> We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM
> save/restores this bit during __deactivate_traps() and hardware clears the
> bit once the guest has consumed the virtual-SError.
> 
> Future patches may add an API (or KVM CAP) to pend a virtual SError with
> a specified ESR.
> 
> Cc: Dongjiu Geng <gengdongjiu@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++++
>  arch/arm64/include/asm/kvm_host.h    |  3 +++
>  arch/arm64/include/asm/sysreg.h      |  1 +
>  arch/arm64/kvm/hyp/switch.c          |  4 ++++
>  arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
>  5 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index e5df3fce0008..8a7a838eb17a 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
>  	vcpu->arch.hcr_el2 = hcr;
>  }
>  
> +static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
> +{
> +	vcpu->arch.vsesr_el2 = vsesr;
> +}
> +
>  static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
>  {
>  	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d3eb79a9ed6b..0af35e71fedb 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -277,6 +277,9 @@ struct kvm_vcpu_arch {
>  
>  	/* Detect first run of a vcpu */
>  	bool has_run_once;
> +
> +	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
> +	u64 vsesr_el2;
>  };
>  
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 427c36bc5dd6..a493e93de296 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -253,6 +253,7 @@
>  
>  #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
>  #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
> +#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
>  #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
>  
>  #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 945e79c641c4..af37658223a0 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
>  		isb();
>  	}
>  	write_sysreg(val, hcr_el2);
> +
> +	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
> +		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
> +
>  	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
>  	write_sysreg(1 << 15, hstr_el2);
>  	/*
> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index da6a8cfa54a0..52f7f66f1356 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>  		inject_undef64(vcpu);
>  }
>  
> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +{
> +	vcpu_set_vsesr(vcpu, esr);
> +	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +}
> +
>  /**
>   * kvm_inject_vabt - inject an async abort / SError into the guest
>   * @vcpu: The VCPU to receive the exception
>   *
>   * It is assumed that this code is called from the VCPU thread and that the
>   * VCPU therefore is not currently executing guest code.
> + *
> + * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
> + * the remaining ISS all-zeros so that this error is not interpreted as an
> + * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR
> + * value, so the CPU generates an imp-def value.
>   */
>  void kvm_inject_vabt(struct kvm_vcpu *vcpu)
>  {
> -	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +	pend_guest_serror(vcpu, ESR_ELx_ISV);
>  }
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2017-10-13  9:25     ` gengdongjiu
@ 2017-10-13 16:53       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-13 16:53 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	wangxiongfeng2, linux-arm-kernel, kvmarm

Hi gengdongjiu,

On 13/10/17 10:25, gengdongjiu wrote:
> After checking this patch, I think my patch[1] already include this logic(only a little
> difference).

Your kvm_handle_guest_sei() is similar to where this series ends up, but the
purpose of this patch is to keep KVM's existing behaviour.

KVM already injects SError into the guest all by itself. With the RAS
extensions it can now specify an ESR, and because of the new ESR encoding it
can't use the reset value of all-zeroes.
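
(A minimal sketch of the encoding change, for illustration only; the bit
position matches ESR_ELx_ISV from the kernel headers, the helper is made up:)

#include <stdint.h>

#define ESR_ELx_ISV	(UINT64_C(1) << 24)	/* reads as 'IDS' with the RAS Extensions */

/*
 * pre-RAS:  esr == 0           -> ISS invalid, the old impdef SError
 * RAS:      esr == 0           -> 'RAS error: Uncategorized'
 *           esr == ESR_ELx_ISV -> implementation defined syndrome, as before
 */
static inline uint64_t impdef_serror_esr(void)
{
	return ESR_ELx_ISV;	/* the value kvm_inject_vabt() now pends via VSESR_EL2 */
}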


> In my first version patch [2], It sets the virtual ESR in the KVM, but Marc and
> other people disagree that[3][4],and propose to set its value and injection by userspace(when
> RAS is enabled). 

Not quite: for RAS errors.
When we want to hand a RAS error to a guest, Qemu should be driving that.

What about impdef SError? Qemu should be able to drive that with the same API.

What about this nasty corner where KVM already injects an impdef SError
directly? This patch keeps that working.


I'd love to get rid of KVM's internal use of kvm_inject_vabt(). But what do we
replace it with? It needs to be a guest exit type that existing software can't
ignore...

(The best I can suggest is: Once we have a mechanism to inject SError into a
guest from Qemu, KVM could make an impdef SError pending, then give Qemu the
opportunity to kill the guest, or set a different ESR. Existing software can
ignore the exit, and take the existing behaviour.)
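
To make that concrete, a rough sketch only: KVM_EXIT_SERROR and the helper
name below are invented for illustration, they are not an existing API.

/* Hypothetical handle_exit() helper, not real code */
static int handle_impdef_serror(struct kvm_vcpu *vcpu)
{
	kvm_inject_vabt(vcpu);			  /* make an impdef SError pending */
	vcpu->run->exit_reason = KVM_EXIT_SERROR; /* invented exit reason */

	/*
	 * Returning 0 exits to user-space, so Qemu can quit, overwrite the
	 * ESR, or resume; ignoring the exit keeps today's behaviour.
	 */
	return 0;
}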


> So I think we no need to submit another patch, it will be duplicated, and waste our review
> time. thank you very much. I will combine that.

I agree we're posting competing series; there was some off-list co-ordination on
this with Xie XiuQi and Xiongfeng Wang in ~May, and it looks like you weren't
involved at that point.

In your last series touching all this:
https://lkml.org/lkml/2017/8/31/698

You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework
underneath it. Applied like this, SError is still always masked in the kernel, so
any system without firmware-first will silently consume and discard an
uncontained RAS error using the esb() in __switch_to(). We can't do this, hence
the first half of this series.


James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions
  2017-10-05 19:17   ` James Morse
@ 2017-10-18 14:23     ` Catalin Marinas
  -1 siblings, 0 replies; 74+ messages in thread
From: Catalin Marinas @ 2017-10-18 14:23 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, kvmarm,
	wangxiongfeng2, linux-arm-kernel

On Thu, Oct 05, 2017 at 08:17:53PM +0100, James Morse wrote:
> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> new file mode 100644
> index 000000000000..fb40da8e1457
> --- /dev/null
> +++ b/arch/arm64/include/asm/daifflags.h
> @@ -0,0 +1,59 @@
> +/*
> + * Copyright (C) 2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_DAIFFLAGS_H
> +#define __ASM_DAIFFLAGS_H
> +
> +#include <asm/irqflags.h>
> +#include <linux/irqflags.h>

Nitpick: linux/irqflags.h already includes asm/irqflags.h.

We currently have local_dbg_save in asm/irqflags.h. Shall we keep all of
them in the same file (either asm/irqflags.h or the new asm/daifflags.h;
slight preference for irqflags.h unless you are worried about drivers
abusing these)?

-- 
Catalin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions
  2017-10-18 14:23     ` Catalin Marinas
@ 2017-10-18 14:25       ` Catalin Marinas
  -1 siblings, 0 replies; 74+ messages in thread
From: Catalin Marinas @ 2017-10-18 14:25 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, wangxiongfeng2,
	linux-arm-kernel, kvmarm

On Wed, Oct 18, 2017 at 03:23:16PM +0100, Catalin Marinas wrote:
> On Thu, Oct 05, 2017 at 08:17:53PM +0100, James Morse wrote:
> > diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> > new file mode 100644
> > index 000000000000..fb40da8e1457
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/daifflags.h
> > @@ -0,0 +1,59 @@
> > +/*
> > + * Copyright (C) 2017 ARM Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +#ifndef __ASM_DAIFFLAGS_H
> > +#define __ASM_DAIFFLAGS_H
> > +
> > +#include <asm/irqflags.h>
> > +#include <linux/irqflags.h>
> 
> Nitpick: linux/irqflags.h already includes asm/irqflags.h.
> 
> We currently have local_dbg_save in asm/irqflags.h. Shall we keep all of
> them in the same file (either asm/irqflags.h or the new asm/daifflags.h;
> slight preference for irqflags.h unless you are worried about drivers
> abusing these).

Ah, you removed local_dbg_* in the following patch, so please ignore
this.

-- 
Catalin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 13/20] arm64: cpufeature: Enable IESB on exception entry/return for firmware-first
  2017-10-05 19:18   ` James Morse
@ 2017-10-18 16:43     ` Catalin Marinas
  -1 siblings, 0 replies; 74+ messages in thread
From: Catalin Marinas @ 2017-10-18 16:43 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, kvmarm,
	wangxiongfeng2, linux-arm-kernel

On Thu, Oct 05, 2017 at 08:18:05PM +0100, James Morse wrote:
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b68f5e93baac..29df2a93688c 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -989,6 +989,21 @@ config ARM64_RAS_EXTN
>  	  and access the new registers if the system supports the extension.
>  	  Platform RAS features may additionally depend on firmware support.
>  
> +config ARM64_IESB
> +	bool "Enable Implicit Error Synchronization Barrier (IESB)"
> +	default y
> +	depends on ARM64_RAS_EXTN
> +	help
> +	  ARM v8.2 adds a feature to add implicit error synchronization
> +	  barriers whenever the CPU enters or exits a particular exception
> +	  level.
> +
> +	  On CPUs with this feature and the 'RAS Extensions' feature, we can
> +	  use this to contain detected (but not yet reported) errors to the
> +	  relevant exception level.
> +
> +	  The feature is detected at runtime, selecting this option will
> +	  enable these implicit barriers if the CPU supports the feature.
>  endmenu

What's the use-case for not having this option always enabled?

-- 
Catalin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support
  2017-10-05 19:17 ` James Morse
@ 2017-10-18 16:55   ` Catalin Marinas
  -1 siblings, 0 replies; 74+ messages in thread
From: Catalin Marinas @ 2017-10-18 16:55 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, kvmarm,
	wangxiongfeng2, linux-arm-kernel

On Thu, Oct 05, 2017 at 08:17:52PM +0100, James Morse wrote:
> James Morse (18):
>   arm64: explicitly mask all exceptions
>   arm64: introduce an order for exceptions
>   arm64: Move the async/fiq helpers to explicitly set process context
>     flags
>   arm64: Mask all exceptions during kernel_exit
>   arm64: entry.S: Remove disable_dbg
>   arm64: entry.S: convert el1_sync
>   arm64: entry.S convert el0_sync
>   arm64: entry.S: convert elX_irq
>   KVM: arm/arm64: mask/unmask daif around VHE guests
>   arm64: kernel: Survive corrected RAS errors notified by SError
>   arm64: cpufeature: Enable IESB on exception entry/return for
>     firmware-first
>   arm64: kernel: Prepare for a DISR user
>   KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
>   KVM: arm64: Save/Restore guest DISR_EL1
>   KVM: arm64: Save ESR_EL2 on guest SError
>   KVM: arm64: Handle RAS SErrors from EL1 on guest exit
>   KVM: arm64: Handle RAS SErrors from EL2 on guest exit
>   KVM: arm64: Take any host SError before entering the guest
> 
> Xie XiuQi (2):
>   arm64: entry.S: move SError handling into a C function for future
>     expansion
>   arm64: cpufeature: Detect CPU RAS Extentions

For patches 1-14 (not the KVM ones):

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

I'm also in favour of renaming the macros to local_daif_save etc. as per
Julien's suggestion.

-- 
Catalin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 13/20] arm64: cpufeature: Enable IESB on exception entry/return for firmware-first
  2017-10-18 16:43     ` Catalin Marinas
@ 2017-10-18 17:14       ` James Morse
  -1 siblings, 0 replies; 74+ messages in thread
From: James Morse @ 2017-10-18 17:14 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, kvmarm,
	wangxiongfeng2, linux-arm-kernel

Hi Catalin,

On 18/10/17 17:43, Catalin Marinas wrote:
> On Thu, Oct 05, 2017 at 08:18:05PM +0100, James Morse wrote:
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index b68f5e93baac..29df2a93688c 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -989,6 +989,21 @@ config ARM64_RAS_EXTN
>>  	  and access the new registers if the system supports the extension.
>>  	  Platform RAS features may additionally depend on firmware support.
>>  
>> +config ARM64_IESB
>> +	bool "Enable Implicit Error Synchronization Barrier (IESB)"
>> +	default y
>> +	depends on ARM64_RAS_EXTN
>> +	help
>> +	  ARM v8.2 adds a feature to add implicit error synchronization
>> +	  barriers whenever the CPU enters or exits a particular exception
>> +	  level.
>> +
>> +	  On CPUs with this feature and the 'RAS Extensions' feature, we can
>> +	  use this to contain detected (but not yet reported) errors to the
>> +	  relevant exception level.
>> +
>> +	  The feature is detected at runtime, selecting this option will
>> +	  enable these implicit barriers if the CPU supports the feature.
>>  endmenu
> 
> What's the use-case for not having this option always enabled?

I don't think there is a strong reason. It ended up with a Kconfig entry just
because it's a separate cpufeature entry. I will merge it with ARM64_RAS_EXTN.

The only reason I can think of to turn it off is if it's implemented but
expensive on some system, and the EL3/Secure RAS firmware policy stuff doesn't
care whether RAS errors cross exception boundaries.


Thanks,

James

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2017-10-18 17:15 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-05 19:17 [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support James Morse
2017-10-05 19:17 ` James Morse
2017-10-05 19:17 ` [PATCH v3 01/20] arm64: explicitly mask all exceptions James Morse
2017-10-05 19:17   ` James Morse
2017-10-11 16:30   ` Julien Thierry
2017-10-11 16:30     ` Julien Thierry
2017-10-12 12:26     ` James Morse
2017-10-12 12:26       ` James Morse
2017-10-18 14:23   ` Catalin Marinas
2017-10-18 14:23     ` Catalin Marinas
2017-10-18 14:25     ` Catalin Marinas
2017-10-18 14:25       ` Catalin Marinas
2017-10-05 19:17 ` [PATCH v3 02/20] arm64: introduce an order for exceptions James Morse
2017-10-05 19:17   ` James Morse
2017-10-11 17:11   ` Julien Thierry
2017-10-11 17:11     ` Julien Thierry
2017-10-05 19:17 ` [PATCH v3 03/20] arm64: Move the async/fiq helpers to explicitly set process context flags James Morse
2017-10-05 19:17   ` James Morse
2017-10-05 19:17 ` [PATCH v3 04/20] arm64: Mask all exceptions during kernel_exit James Morse
2017-10-05 19:17   ` James Morse
2017-10-05 19:17 ` [PATCH v3 05/20] arm64: entry.S: Remove disable_dbg James Morse
2017-10-05 19:17   ` James Morse
2017-10-05 19:17 ` [PATCH v3 06/20] arm64: entry.S: convert el1_sync James Morse
2017-10-05 19:17   ` James Morse
2017-10-05 19:17 ` [PATCH v3 07/20] arm64: entry.S convert el0_sync James Morse
2017-10-05 19:17   ` James Morse
2017-10-05 19:18 ` [PATCH v3 08/20] arm64: entry.S: convert elX_irq James Morse
2017-10-05 19:18   ` James Morse
2017-10-11 17:13   ` Julien Thierry
2017-10-11 17:13     ` Julien Thierry
2017-10-12 12:26     ` James Morse
2017-10-12 12:26       ` James Morse
2017-10-05 19:18 ` [PATCH v3 09/20] KVM: arm/arm64: mask/unmask daif around VHE guests James Morse
2017-10-05 19:18   ` James Morse
2017-10-11  9:01   ` Marc Zyngier
2017-10-11  9:01     ` Marc Zyngier
2017-10-11 15:40     ` James Morse
2017-10-11 15:40       ` James Morse
2017-10-05 19:18 ` [PATCH v3 10/20] arm64: entry.S: move SError handling into a C function for future expansion James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 11/20] arm64: cpufeature: Detect CPU RAS Extentions James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 12/20] arm64: kernel: Survive corrected RAS errors notified by SError James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 13/20] arm64: cpufeature: Enable IESB on exception entry/return for firmware-first James Morse
2017-10-05 19:18   ` James Morse
2017-10-18 16:43   ` Catalin Marinas
2017-10-18 16:43     ` Catalin Marinas
2017-10-18 17:14     ` James Morse
2017-10-18 17:14       ` James Morse
2017-10-05 19:18 ` [PATCH v3 14/20] arm64: kernel: Prepare for a DISR user James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
2017-10-05 19:18   ` James Morse
2017-10-13  9:25   ` gengdongjiu
2017-10-13  9:25     ` gengdongjiu
2017-10-13 16:53     ` James Morse
2017-10-13 16:53       ` James Morse
2017-10-05 19:18 ` [PATCH v3 16/20] KVM: arm64: Save/Restore guest DISR_EL1 James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 17/20] KVM: arm64: Save ESR_EL2 on guest SError James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 18/20] KVM: arm64: Handle RAS SErrors from EL1 on guest exit James Morse
2017-10-05 19:18   ` James Morse
2017-10-05 19:18 ` [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 " James Morse
2017-10-05 19:18   ` James Morse
2017-10-11 10:37   ` Marc Zyngier
2017-10-11 10:37     ` Marc Zyngier
2017-10-12 12:28     ` James Morse
2017-10-12 12:28       ` James Morse
2017-10-05 19:18 ` [PATCH v3 20/20] KVM: arm64: Take any host SError before entering the guest James Morse
2017-10-05 19:18   ` James Morse
2017-10-18 16:55 ` [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support Catalin Marinas
2017-10-18 16:55   ` Catalin Marinas
