* [PATCH v6 00/13] arm64/KVM: RAS & IESB for firmware first support
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

Hello,

The aim of this series is to enable IESB to let us kick any pending RAS
errors into firmware to be handled by firmware-first.

v6 is a rebase onto arm64's for-next/core branch to fix the conflicts.

I've picked up the Reviewed-bys from v4 that I missed on patches 7, 9 and 13
(they have the relevant patchwork links). Please call me names if I missed any
more! (Dropping them on patches 11 & 12 is deliberate, due to handle_exit_early().)


Not all systems will have firmware support, so these RAS errors will become
pending SErrors delivered to the kernel. The first part of the series adds
some crude categorization for SErrors into 'fatal' or ignorable. This stops us
panic()ing for corrected errors, but we make no attempt to handle the error.
Proper kernel-first support will be able to do a much better job here.
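
Condensed from patch 4, the categorization is a switch on the AET bits of
the SError's ESR. As a sketch (not the full code):

	switch (esr & ESR_ELx_AET) {
	case ESR_ELx_AET_CE:	/* corrected */
	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
		return false;	/* ignorable: the CPU can make progress */
	case ESR_ELx_AET_UEU:	/* uncorrected unrecoverable */
	case ESR_ELx_AET_UER:	/* uncorrected recoverable */
		return true;	/* fatal: the CPU can't make progress */
	default:		/* uncontainable or uncategorized */
		arm64_serror_panic(regs, esr);
	}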

The second part of the series provides the same minimal handling for SErrors
that interrupt KVM. KVM is currently unable to handle SErrors during
world-switch: unless they occur during a magic single-instruction window,
it hyp-panics. I suspect this will be easier to fix once the VHE world-switch
is further optimised.

KVM's kvm_inject_vabt() needs updating for v8.2: we can now specify an ESR,
and all-zeros has a RAS meaning.
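
(For reference, the v8.2 mechanism: VSESR_EL2 provides the syndrome the
guest will see when the virtual SError pended by HCR_EL2.VSE is delivered.
A minimal sketch, with helper names as used by this series:

	static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
	{
		vcpu_set_vsesr(vcpu, esr);	/* syndrome the guest reads */
		vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
	}

kvm_inject_vabt() can then pass an ESR with the impdef bit set, instead of
the all-zeros value that v8.2 interprets as an uncategorized RAS error.)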

Until we have kernel-first support, containable RAS errors that interrupt a
guest are considered by KVM using the same crude categorization the arch code
uses. Fatal errors are treated as an impdef SError; non-fatal errors are
ignored. Again, proper kernel-first support will do better.
(Uncontained errors from a guest will always cause the host to panic.)


Known issues:
 * Synchronous external abort SET severity is not yet considered; all
   synchronous external aborts are still treated as fatal.

 * KVM-Migration: HCR_EL2.VSE and VSESR_EL2 cannot be migrated when the guest
   has an SError pending. An API using {G,S}ET_EVENTS is on my todo list.

 * KVM unmasks SError and IRQ before calling handle_exit_early, so we may
   take interrupts while holding an uncontained ESR... (this is currently an
   improvement on assuming it's an impdef error we can blame on the guest)
    * To fix this for APEI's SEI or kernel-first RAS, the guest-exit
      SError handling will need to move to before kvm_arm_vhe_guest_exit(),
      or at least into a region where SError and IRQ are still masked.

Thanks,

James


Dongjiu Geng (1):
  KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA

James Morse (11):
  arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
  arm64: sysreg: Move to use definitions for all the SCTLR bits
  arm64: kernel: Survive corrected RAS errors notified by SError
  arm64: Unconditionally enable IESB on exception entry/return for
    firmware-first
  arm64: kernel: Prepare for a DISR user
  KVM: arm/arm64: mask/unmask daif around VHE guests
  KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  KVM: arm64: Save/Restore guest DISR_EL1
  KVM: arm64: Save ESR_EL2 on guest SError
  KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  KVM: arm64: Handle RAS SErrors from EL2 on guest exit

Xie XiuQi (1):
  arm64: cpufeature: Detect CPU RAS Extentions

 arch/arm/include/asm/kvm_host.h      |  5 +++
 arch/arm64/Kconfig                   | 16 +++++++
 arch/arm64/include/asm/assembler.h   |  7 ++++
 arch/arm64/include/asm/cpucaps.h     |  3 +-
 arch/arm64/include/asm/esr.h         | 20 +++++++++
 arch/arm64/include/asm/exception.h   | 14 +++++++
 arch/arm64/include/asm/kvm_arm.h     |  2 +
 arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++
 arch/arm64/include/asm/kvm_host.h    | 17 ++++++++
 arch/arm64/include/asm/processor.h   |  1 +
 arch/arm64/include/asm/sysreg.h      | 81 +++++++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/traps.h       | 54 ++++++++++++++++++++++++
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kernel/cpufeature.c       | 26 +++++++++++-
 arch/arm64/kernel/head.S             | 13 ++----
 arch/arm64/kernel/traps.c            | 51 ++++++++++++++++++++---
 arch/arm64/kvm/handle_exit.c         | 32 +++++++++++++-
 arch/arm64/kvm/hyp/entry.S           | 13 ++++++
 arch/arm64/kvm/hyp/switch.c          | 12 ++++--
 arch/arm64/kvm/hyp/sysreg-sr.c       |  6 +++
 arch/arm64/kvm/inject_fault.c        | 13 +++++-
 arch/arm64/kvm/sys_regs.c            | 11 +++++
 arch/arm64/mm/proc.S                 | 29 +++----------
 virt/kvm/arm/arm.c                   |  7 ++++
 24 files changed, 402 insertions(+), 49 deletions(-)

-- 
2.15.1


* [PATCH v6 01/13] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

this_cpu_has_cap() tests caps->desc, not caps->matches, so it stops
walking the list when it finds a 'silent' feature, instead of
walking to the end of the list.

Prior to v4.6's 644c2ae198412 ("arm64: cpufeature: Test 'matches' pointer
to find the end of the list") we always tested desc to find the end of
a capability list. This was changed for dubious things like PAN_NOT_UAO.
v4.7's e3661b128e53e ("arm64: Allow a capability to be checked on
single CPU") added this_cpu_has_cap() using the old desc style test.

CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
So far only ARM64_HAS_SYSREG_GIC_CPUIF and errata use this_cpu_has_cap();
all the errata have descriptions, and the GIC_CPUIF feature is first in
the list, so it's not possible to hit this with mainline. I don't think
this should go to stable - this is not intended as a fix.
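
To illustrate the walk: with the old '->desc' terminator a silent entry
(one with no .desc) ends the loop early, hiding everything after it.
A made-up array:

	static const struct arm64_cpu_capabilities caps[] = {
		{ .desc = "first, noisy",  .capability = 0, .matches = cap_a },
		{ /* silent: no .desc */   .capability = 1, .matches = cap_b },
		{ .desc = "never reached", .capability = 2, .matches = cap_c },
		{ /* terminator: NULL .matches */ },
	};

The old loop stops at the second entry, so capability 2 can never be
matched; terminating on ->matches walks the whole array.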

 arch/arm64/kernel/cpufeature.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9ef84d0def9a..d88cd0e88606 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1303,8 +1303,8 @@ static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
 	if (WARN_ON(preemptible()))
 		return false;
 
-	for (caps = cap_array; caps->desc; caps++)
-		if (caps->capability == cap && caps->matches)
+	for (caps = cap_array; caps->matches; caps++)
+		if (caps->capability == cap)
 			return caps->matches(caps, SCOPE_LOCAL_CPU);
 
 	return false;
-- 
2.15.1


* [PATCH v6 02/13] arm64: sysreg: Move to use definitions for all the SCTLR bits
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

__cpu_setup() configures SCTLR_EL1 using some hard-coded hex masks,
and el2_setup() duplicates some of this when setting RES1 bits.

Let's make this the same as KVM's hyp_init, which uses named bits.

First, we add definitions for all the SCTLR_EL{1,2} bits, the RES{1,0}
bits, and those we want to set or clear.

Add build_bug checks to ensure all bits are either set or cleared.
This means we don't need to preserve the endianness configuration
generated elsewhere.
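
The check itself is simple: if every bit appears in exactly one of the
set/clear masks, their XOR is all-ones. A toy 8-bit version of the idea:

	#define TOY_SET		0x35	/* bits we set */
	#define TOY_CLEAR	0xca	/* bits we clear */
	/* fails to build if a bit is in neither mask, or in both */
	BUILD_BUG_ON((TOY_SET ^ TOY_CLEAR) != 0xff);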

Finally, move the head.S and proc.S users of these hard-coded masks
over to the macro versions.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 65 +++++++++++++++++++++++++++++++++++++++--
 arch/arm64/kernel/head.S        | 13 ++-------
 arch/arm64/mm/proc.S            | 24 +--------------
 3 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 54e99af043c6..1a8108f84932 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -20,6 +20,7 @@
 #ifndef __ASM_SYSREG_H
 #define __ASM_SYSREG_H
 
+#include <asm/compiler.h>
 #include <linux/stringify.h>
 
 /*
@@ -398,25 +399,81 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_WXN	(1 << 19)
 #define SCTLR_ELx_I	(1 << 12)
 #define SCTLR_ELx_SA	(1 << 3)
 #define SCTLR_ELx_C	(1 << 2)
 #define SCTLR_ELx_A	(1 << 1)
 #define SCTLR_ELx_M	1
 
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
+			 SCTLR_ELx_SA | SCTLR_ELx_I)
+
+/* SCTLR_EL2 specific flags. */
 #define SCTLR_EL2_RES1	((1 << 4)  | (1 << 5)  | (1 << 11) | (1 << 16) | \
 			 (1 << 18) | (1 << 22) | (1 << 23) | (1 << 28) | \
 			 (1 << 29))
+#define SCTLR_EL2_RES0	((1 << 6)  | (1 << 7)  | (1 << 8)  | (1 << 9)  | \
+			 (1 << 10) | (1 << 13) | (1 << 14) | (1 << 15) | \
+			 (1 << 17) | (1 << 20) | (1 << 21) | (1 << 24) | \
+			 (1 << 26) | (1 << 27) | (1 << 30) | (1 << 31))
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define ENDIAN_SET_EL2		SCTLR_ELx_EE
+#define ENDIAN_CLEAR_EL2	0
+#else
+#define ENDIAN_SET_EL2		0
+#define ENDIAN_CLEAR_EL2	SCTLR_ELx_EE
+#endif
+
+/* SCTLR_EL2 value used for the hyp-stub */
+#define SCTLR_EL2_SET	(ENDIAN_SET_EL2   | SCTLR_EL2_RES1)
+#define SCTLR_EL2_CLEAR	(SCTLR_ELx_M      | SCTLR_ELx_A    | SCTLR_ELx_C   | \
+			 SCTLR_ELx_SA     | SCTLR_ELx_I    | SCTLR_ELx_WXN | \
+			 ENDIAN_CLEAR_EL2 | SCTLR_EL2_RES0)
+
+/* Check all the bits are accounted for */
+#define SCTLR_EL2_BUILD_BUG_ON_MISSING_BITS	BUILD_BUG_ON((SCTLR_EL2_SET ^ SCTLR_EL2_CLEAR) != ~0)
 
-#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
-			 SCTLR_ELx_SA | SCTLR_ELx_I)
 
 /* SCTLR_EL1 specific flags. */
 #define SCTLR_EL1_UCI		(1 << 26)
+#define SCTLR_EL1_E0E		(1 << 24)
 #define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_NTWE		(1 << 18)
+#define SCTLR_EL1_NTWI		(1 << 16)
 #define SCTLR_EL1_UCT		(1 << 15)
+#define SCTLR_EL1_DZE		(1 << 14)
+#define SCTLR_EL1_UMA		(1 << 9)
 #define SCTLR_EL1_SED		(1 << 8)
+#define SCTLR_EL1_ITD		(1 << 7)
 #define SCTLR_EL1_CP15BEN	(1 << 5)
+#define SCTLR_EL1_SA0		(1 << 4)
+
+#define SCTLR_EL1_RES1	((1 << 11) | (1 << 20) | (1 << 22) | (1 << 28) | \
+			 (1 << 29))
+#define SCTLR_EL1_RES0  ((1 << 6)  | (1 << 10) | (1 << 13) | (1 << 17) | \
+			 (1 << 21) | (1 << 27) | (1 << 30) | (1 << 31))
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define ENDIAN_SET_EL1		(SCTLR_EL1_E0E | SCTLR_ELx_EE)
+#define ENDIAN_CLEAR_EL1	0
+#else
+#define ENDIAN_SET_EL1		0
+#define ENDIAN_CLEAR_EL1	(SCTLR_EL1_E0E | SCTLR_ELx_EE)
+#endif
+
+#define SCTLR_EL1_SET	(SCTLR_ELx_M    | SCTLR_ELx_C    | SCTLR_ELx_SA   |\
+			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
+			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT  | SCTLR_EL1_NTWI |\
+			 SCTLR_EL1_NTWE | SCTLR_EL1_SPAN | ENDIAN_SET_EL1 |\
+			 SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
+#define SCTLR_EL1_CLEAR	(SCTLR_ELx_A   | SCTLR_EL1_CP15BEN | SCTLR_EL1_ITD    |\
+			 SCTLR_EL1_UMA | SCTLR_ELx_WXN     | ENDIAN_CLEAR_EL1 |\
+			 SCTLR_EL1_RES0)
+
+/* Check all the bits are accounted for */
+#define SCTLR_EL1_BUILD_BUG_ON_MISSING_BITS	BUILD_BUG_ON((SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != ~0)
 
 /* id_aa64isar0 */
 #define ID_AA64ISAR0_FHM_SHIFT		48
@@ -593,6 +650,7 @@
 
 #else
 
+#include <linux/build_bug.h>
 #include <linux/types.h>
 
 asm(
@@ -649,6 +707,9 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
 {
 	u32 val;
 
+	SCTLR_EL2_BUILD_BUG_ON_MISSING_BITS;
+	SCTLR_EL1_BUILD_BUG_ON_MISSING_BITS;
+
 	val = read_sysreg(sctlr_el1);
 	val &= ~clear;
 	val |= set;
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 95748a00eb89..c3b241b8b659 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -492,17 +492,13 @@ ENTRY(el2_setup)
 	mrs	x0, CurrentEL
 	cmp	x0, #CurrentEL_EL2
 	b.eq	1f
-	mrs	x0, sctlr_el1
-CPU_BE(	orr	x0, x0, #(3 << 24)	)	// Set the EE and E0E bits for EL1
-CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
+	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
 	msr	sctlr_el1, x0
 	mov	w0, #BOOT_CPU_MODE_EL1		// This cpu booted in EL1
 	isb
 	ret
 
-1:	mrs	x0, sctlr_el2
-CPU_BE(	orr	x0, x0, #(1 << 25)	)	// Set the EE bit for EL2
-CPU_LE(	bic	x0, x0, #(1 << 25)	)	// Clear the EE bit for EL2
+1:	mov_q	x0, (SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
 	msr	sctlr_el2, x0
 
 #ifdef CONFIG_ARM64_VHE
@@ -618,10 +614,7 @@ install_el2_stub:
 	 * requires no configuration, and all non-hyp-specific EL2 setup
 	 * will be done via the _EL1 system register aliases in __cpu_setup.
 	 */
-	/* sctlr_el1 */
-	mov	x0, #0x0800			// Set/clear RES{1,0} bits
-CPU_BE(	movk	x0, #0x33d0, lsl #16	)	// Set EE and E0E on BE systems
-CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
+	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
 	msr	sctlr_el1, x0
 
 	/* Coprocessor traps. */
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 5a59eea49395..ec4c3d82b4d6 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -226,11 +226,7 @@ ENTRY(__cpu_setup)
 	/*
 	 * Prepare SCTLR
 	 */
-	adr	x5, crval
-	ldp	w5, w6, [x5]
-	mrs	x0, sctlr_el1
-	bic	x0, x0, x5			// clear bits
-	orr	x0, x0, x6			// set bits
+	mov_q	x0, SCTLR_EL1_SET
 	/*
 	 * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
 	 * both user and kernel.
@@ -259,21 +255,3 @@ ENTRY(__cpu_setup)
 	msr	tcr_el1, x10
 	ret					// return to head.S
 ENDPROC(__cpu_setup)
-
-	/*
-	 * We set the desired value explicitly, including those of the
-	 * reserved bits. The values of bits EE & E0E were set early in
-	 * el2_setup, which are left untouched below.
-	 *
-	 *                 n n            T
-	 *       U E      WT T UD     US IHBS
-	 *       CE0      XWHW CZ     ME TEEA S
-	 * .... .IEE .... NEAI TE.I ..AD DEN0 ACAM
-	 * 0011 0... 1101 ..0. ..0. 10.. .0.. .... < hardware reserved
-	 * .... .1.. .... 01.1 11.1 ..01 0.01 1101 < software settings
-	 */
-	.type	crval, #object
-crval:
-	.word	0xfcffffff			// clear
-	.word	0x34d5d91d			// set
-	.popsection
-- 
2.15.1


* [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extentions
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

From: Xie XiuQi <xiexiuqi@huawei.com>

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions, system software
can use additional barriers to isolate errors and determine if faults
are pending. Add cpufeature detection.
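
The detection is a standard unsigned field check on ID_AA64PFR0_EL1.
Open-coded for illustration (the patch goes through has_cpuid_feature()):

	/* RAS field is ID_AA64PFR0_EL1[31:28]; 1 means RAS Extension v1 */
	u64 pfr0 = read_sysreg(id_aa64pfr0_el1);
	bool has_ras = ((pfr0 >> ID_AA64PFR0_RAS_SHIFT) & 0xf) >=
		       ID_AA64PFR0_RAS_V1;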

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
[Rebased, added config option, reworded commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
Changes since v4:
 * Removed barrier in context switch

 arch/arm64/Kconfig               | 16 ++++++++++++++++
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 664fadc2aa2e..1d51c8edf34b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1062,6 +1062,22 @@ config ARM64_PMEM
 	  operations if DC CVAP is not supported (following the behaviour of
 	  DC CVAP itself if the system does not define a point of persistence).
 
+config ARM64_RAS_EXTN
+	bool "Enable support for RAS CPU Extensions"
+	default y
+	help
+	  CPUs that support the Reliability, Availability and Serviceability
+	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
+	  errors, classify them and report them to software.
+
+	  On CPUs with these extensions system software can use additional
+	  barriers to determine if faults are pending and read the
+	  classification from a new set of registers.
+
+	  Selecting this feature will allow the kernel to use these barriers
+	  and access the new registers if the system supports the extension.
+	  Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_SVE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 7049b4802587..bb263820de13 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -44,7 +44,8 @@
 #define ARM64_UNMAP_KERNEL_AT_EL0		23
 #define ARM64_HARDEN_BRANCH_PREDICTOR		24
 #define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
+#define ARM64_HAS_RAS_EXTN			26
 
-#define ARM64_NCAPS				26
+#define ARM64_NCAPS				27
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 1a8108f84932..321622e9f9c3 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -498,6 +498,7 @@
 #define ID_AA64PFR0_CSV3_SHIFT		60
 #define ID_AA64PFR0_CSV2_SHIFT		56
 #define ID_AA64PFR0_SVE_SHIFT		32
+#define ID_AA64PFR0_RAS_SHIFT		28
 #define ID_AA64PFR0_GIC_SHIFT		24
 #define ID_AA64PFR0_ASIMD_SHIFT		20
 #define ID_AA64PFR0_FP_SHIFT		16
@@ -507,6 +508,7 @@
 #define ID_AA64PFR0_EL0_SHIFT		0
 
 #define ID_AA64PFR0_SVE			0x1
+#define ID_AA64PFR0_RAS_V1		0x1
 #define ID_AA64PFR0_FP_NI		0xf
 #define ID_AA64PFR0_FP_SUPPORTED	0x0
 #define ID_AA64PFR0_ASIMD_NI		0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d88cd0e88606..0c0af18121e1 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -149,6 +149,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -1028,6 +1029,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.enable = sve_kernel_enable,
 	},
 #endif /* CONFIG_ARM64_SVE */
+#ifdef CONFIG_ARM64_RAS_EXTN
+	{
+		.desc = "RAS Extension Support",
+		.capability = ARM64_HAS_RAS_EXTN,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_RAS_SHIFT,
+		.min_field_value = ID_AA64PFR0_RAS_V1,
+	},
+#endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
 
-- 
2.15.1


* [PATCH v6 04/13] arm64: kernel: Survive corrected RAS errors notified by SError
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

Prior to v8.2, SError is an uncontainable fatal exception. The v8.2 RAS
extensions use SError to notify software about RAS errors; these can be
contained by the Error Synchronization Barrier.

An ACPI system with firmware-first may use SError as its 'SEI'
notification. Future patches may add code to 'claim' this SError as a
notification.

Other systems can distinguish these RAS errors from the SError ESR and
use the AET bits and additional data from RAS-Error registers to handle
the error. Future patches may add this kernel-first handling.

Without support for either of these we will panic(), even if we received
a corrected error. Add code to decode the severity of RAS errors. We can
safely ignore contained errors where the CPU can continue to make
progress. For all other errors we continue to panic().

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
Changes since v4:
 * Reworded 'blocking' to 'fatal'.
 * Added AET_SHIFT define
 * Check local CPU caps to allow survivable contained SError on systems where
   some CPUs have RAS support and some don't.
 * Clarified where we are merging uncategorized and uncontained. Mostly
   comments and swapping '0' for a macro defined as 0.
   Note: These two share an encoding on aarch32.
 * Removed preempt bodge; KVM now calls this from a non-preemptible location.

(all of which I think are minor)
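
A worked example of the decode, using a made-up syndrome: EC=0x2f (SError),
IL set, IDS (bit 24) clear, DFSC=0x11 and AET=0b110 (corrected) gives an
ESR of 0xbe001811, which the helpers below classify as survivable:

	u32 esr = 0xbe001811;			/* corrected-error SError */

	arm64_is_ras_serror(esr);		/* true: IDS is clear     */
	arm64_ras_serror_get_severity(esr);	/* ESR_ELx_AET_CE         */
	arm64_is_fatal_ras_serror(regs, esr);	/* false: safe to ignore  */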


 arch/arm64/include/asm/esr.h   | 13 ++++++++++
 arch/arm64/include/asm/traps.h | 54 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/traps.c      | 51 +++++++++++++++++++++++++++++++++++----
 3 files changed, 113 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 014d7d8edcf9..c367838700fa 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -86,6 +86,18 @@
 #define ESR_ELx_WNR_SHIFT	(6)
 #define ESR_ELx_WNR		(UL(1) << ESR_ELx_WNR_SHIFT)
 
+/* Asynchronous Error Type */
+#define ESR_ELx_IDS_SHIFT	(24)
+#define ESR_ELx_IDS		(UL(1) << ESR_ELx_IDS_SHIFT)
+#define ESR_ELx_AET_SHIFT	(10)
+#define ESR_ELx_AET		(UL(0x7) << ESR_ELx_AET_SHIFT)
+
+#define ESR_ELx_AET_UC		(UL(0) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UEU		(UL(1) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UEO		(UL(2) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UER		(UL(3) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_CE		(UL(6) << ESR_ELx_AET_SHIFT)
+
 /* Shared ISS field definitions for Data/Instruction aborts */
 #define ESR_ELx_SET_SHIFT	(11)
 #define ESR_ELx_SET_MASK	(UL(3) << ESR_ELx_SET_SHIFT)
@@ -100,6 +112,7 @@
 #define ESR_ELx_FSC		(0x3F)
 #define ESR_ELx_FSC_TYPE	(0x3C)
 #define ESR_ELx_FSC_EXTABT	(0x10)
+#define ESR_ELx_FSC_SERROR	(0x11)
 #define ESR_ELx_FSC_ACCESS	(0x08)
 #define ESR_ELx_FSC_FAULT	(0x04)
 #define ESR_ELx_FSC_PERM	(0x0C)
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index 1696f9de9359..178e338d2889 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -19,6 +19,7 @@
 #define __ASM_TRAP_H
 
 #include <linux/list.h>
+#include <asm/esr.h>
 #include <asm/sections.h>
 
 struct pt_regs;
@@ -66,4 +67,57 @@ static inline int in_entry_text(unsigned long ptr)
 	return ptr >= (unsigned long)&__entry_text_start &&
 	       ptr < (unsigned long)&__entry_text_end;
 }
+
+/*
+ * CPUs with the RAS extensions have an Implementation-Defined-Syndrome bit
+ * to indicate whether this ESR has a RAS encoding. CPUs without this feature
+ * have an ISS-Valid bit in the same position.
+ * If this bit is set, we know it's not a RAS SError.
+ * If it's clear, we need to know if the CPU supports RAS. Uncategorized RAS
+ * errors share the same encoding as an all-zeros encoding from a CPU that
+ * doesn't support RAS.
+ */
+static inline bool arm64_is_ras_serror(u32 esr)
+{
+	WARN_ON(preemptible());
+
+	if (esr & ESR_ELx_IDS)
+		return false;
+
+	if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN))
+		return true;
+	else
+		return false;
+}
+
+/*
+ * Return the AET bits from a RAS SError's ESR.
+ *
+ * It is implementation defined whether Uncategorized errors are containable.
+ * We treat them as Uncontainable.
+ * Non-RAS SError's are reported as Uncontained/Uncategorized.
+ */
+static inline u32 arm64_ras_serror_get_severity(u32 esr)
+{
+	u32 aet = esr & ESR_ELx_AET;
+
+	if (!arm64_is_ras_serror(esr)) {
+		/* Not a RAS error, we can't interpret the ESR. */
+		return ESR_ELx_AET_UC;
+	}
+
+	/*
+	 * AET is RES0 if 'the value returned in the DFSC field is not
+	 * [ESR_ELx_FSC_SERROR]'
+	 */
+	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
+		/* No severity information : Uncategorized */
+		return ESR_ELx_AET_UC;
+	}
+
+	return aet;
+}
+
+bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr);
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr);
 #endif
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 3d3588fcd1c7..bbb0fde2780e 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -662,17 +662,58 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
 }
 #endif
 
-asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
 {
-	nmi_enter();
-
 	console_verbose();
 
 	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
 		smp_processor_id(), esr, esr_get_class_string(esr));
-	__show_regs(regs);
+	if (regs)
+		__show_regs(regs);
+
+	nmi_panic(regs, "Asynchronous SError Interrupt");
+
+	cpu_park_loop();
+	unreachable();
+}
+
+bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr)
+{
+	u32 aet = arm64_ras_serror_get_severity(esr);
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
+		/*
+		 * The CPU can make progress. We may take UEO again as
+		 * a more severe error.
+		 */
+		return false;
+
+	case ESR_ELx_AET_UEU:	/* Uncorrected Unrecoverable */
+	case ESR_ELx_AET_UER:	/* Uncorrected Recoverable */
+		/*
+		 * The CPU can't make progress. The exception may have
+		 * been imprecise.
+		 */
+		return true;
+
+	case ESR_ELx_AET_UC:	/* Uncontainable or Uncategorized error */
+	default:
+		/* Error has been silently propagated */
+		arm64_serror_panic(regs, esr);
+	}
+}
+
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	/* non-RAS errors are not containable */
+	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
+		arm64_serror_panic(regs, esr);
 
-	panic("Asynchronous SError Interrupt");
+	nmi_exit();
 }
 
 void __pte_error(const char *file, int line, unsigned long val)
-- 
2.15.1


* [PATCH v6 05/13] arm64: Unconditionally enable IESB on exception entry/return for firmware-first
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

ARM v8.2 has a feature to add implicit error synchronization barriers
whenever the CPU enters or returns from an exception level. Add this to the
features we always enable. CPUs that don't support this feature will treat
the bit as RES0.
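
(Illustratively, the runtime equivalent of what the new SCTLR_EL1_SET boot
default does, using the existing config_sctlr_el1() helper:

	config_sctlr_el1(0, SCTLR_ELx_IESB);	/* bit 21: implicit ESB */

CPUs without the feature treat bit 21 as RES0 and ignore the write.)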

This feature causes RAS errors that are not yet visible to software to
become pending SErrors. We expect to have firmware-first RAS support,
so synchronised RAS errors will be taken immediately to EL3.
Any system without firmware-first handling of errors will take the SError
either immediately after exception return, or when we unmask SError after
entry.S's work.

Adding IESB to the ELx flags causes it to be enabled by KVM and kexec
too.

Platform level RAS support may require additional firmware support.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Link: https://www.spinics.net/lists/kvm-arm/msg28192.html
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * Unconditionally enabled in SCTLR_ELx, (so a rewrite)
 * Dropped Catalin's Reviewed-by.

Changes since v3:
 * removed IESB Kconfig option

 arch/arm64/include/asm/sysreg.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 321622e9f9c3..1281bc8263c2 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -399,6 +399,7 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_IESB	(1 << 21)
 #define SCTLR_ELx_WXN	(1 << 19)
 #define SCTLR_ELx_I	(1 << 12)
 #define SCTLR_ELx_SA	(1 << 3)
@@ -406,8 +407,8 @@
 #define SCTLR_ELx_A	(1 << 1)
 #define SCTLR_ELx_M	1
 
-#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
-			 SCTLR_ELx_SA | SCTLR_ELx_I)
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M  | SCTLR_ELx_A | SCTLR_ELx_C | \
+			 SCTLR_ELx_SA | SCTLR_ELx_I | SCTLR_ELx_IESB)
 
 /* SCTLR_EL2 specific flags. */
 #define SCTLR_EL2_RES1	((1 << 4)  | (1 << 5)  | (1 << 11) | (1 << 16) | \
@@ -415,8 +416,8 @@
 			 (1 << 29))
 #define SCTLR_EL2_RES0	((1 << 6)  | (1 << 7)  | (1 << 8)  | (1 << 9)  | \
 			 (1 << 10) | (1 << 13) | (1 << 14) | (1 << 15) | \
-			 (1 << 17) | (1 << 20) | (1 << 21) | (1 << 24) | \
-			 (1 << 26) | (1 << 27) | (1 << 30) | (1 << 31))
+			 (1 << 17) | (1 << 20) | (1 << 24) | (1 << 26) | \
+			 (1 << 27) | (1 << 30) | (1 << 31))
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define ENDIAN_SET_EL2		SCTLR_ELx_EE
@@ -427,7 +428,7 @@
 #endif
 
 /* SCTLR_EL2 value used for the hyp-stub */
-#define SCTLR_EL2_SET	(ENDIAN_SET_EL2   | SCTLR_EL2_RES1)
+#define SCTLR_EL2_SET	(SCTLR_ELx_IESB   | ENDIAN_SET_EL2   | SCTLR_EL2_RES1)
 #define SCTLR_EL2_CLEAR	(SCTLR_ELx_M      | SCTLR_ELx_A    | SCTLR_ELx_C   | \
 			 SCTLR_ELx_SA     | SCTLR_ELx_I    | SCTLR_ELx_WXN | \
 			 ENDIAN_CLEAR_EL2 | SCTLR_EL2_RES0)
@@ -453,7 +454,7 @@
 #define SCTLR_EL1_RES1	((1 << 11) | (1 << 20) | (1 << 22) | (1 << 28) | \
 			 (1 << 29))
 #define SCTLR_EL1_RES0  ((1 << 6)  | (1 << 10) | (1 << 13) | (1 << 17) | \
-			 (1 << 21) | (1 << 27) | (1 << 30) | (1 << 31))
+			 (1 << 27) | (1 << 30) | (1 << 31))
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define ENDIAN_SET_EL1		(SCTLR_EL1_E0E | SCTLR_ELx_EE)
@@ -466,8 +467,8 @@
 #define SCTLR_EL1_SET	(SCTLR_ELx_M    | SCTLR_ELx_C    | SCTLR_ELx_SA   |\
 			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
 			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT  | SCTLR_EL1_NTWI |\
-			 SCTLR_EL1_NTWE | SCTLR_EL1_SPAN | ENDIAN_SET_EL1 |\
-			 SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
+			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
+			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
 #define SCTLR_EL1_CLEAR	(SCTLR_ELx_A   | SCTLR_EL1_CP15BEN | SCTLR_EL1_ITD    |\
 			 SCTLR_EL1_UMA | SCTLR_ELx_WXN     | ENDIAN_CLEAR_EL1 |\
 			 SCTLR_EL1_RES0)
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 06/13] arm64: kernel: Prepare for a DISR user
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:38   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

KVM would like to consume any pending SError (or RAS error) after guest
exit. Today it has to unmask SError and use dsb+isb to synchronise the
CPU. With the RAS extensions we can use ESB to synchronise any pending
SError.

Add the necessary macros to allow DISR to be read and converted to an
ESR.

We clear the DISR register when we enable the RAS cpufeature. At that
point the kernel has not executed any ESB instructions, so any value we
find in DISR must have belonged to firmware. Executing an ESB instruction
is the only way to update DISR, so we can expect firmware to have handled
any deferred SError. By the same logic we clear DISR in the idle path.
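
A rough sketch of the intended use, assuming a RAS-capable CPU (the real
consumer arrives in a later patch; 'hint #16' is the esb encoding):

	u64 disr;

	asm volatile("hint #16" ::: "memory");	/* esb */
	disr = read_sysreg_s(SYS_DISR_EL1);
	if (disr) {
		u32 esr = disr_to_esr(disr);	/* build an SError ESR */

		write_sysreg_s(0, SYS_DISR_EL1);	/* consume it */
		/* ... hand esr to the SError handler ... */
	}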

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
Changes since v4:
 * Corrected 'we ran ESB()' in the commit message.

 arch/arm64/include/asm/assembler.h |  7 +++++++
 arch/arm64/include/asm/esr.h       |  7 +++++++
 arch/arm64/include/asm/exception.h | 14 ++++++++++++++
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     |  9 +++++++++
 arch/arm64/mm/proc.S               |  5 +++++
 7 files changed, 44 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 5dc4856f3bb9..40e506b32cb6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -108,6 +108,13 @@
 	dmb	\opt
 	.endm
 
+/*
+ * RAS Error Synchronization barrier
+ */
+	.macro  esb
+	hint    #16
+	.endm
+
 /*
  * NOP sequence
  */
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index c367838700fa..803443d74926 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -140,6 +140,13 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
+#define DISR_EL1_IDS		(UL(1) << 24)
+/*
+ * DISR_EL1 and ESR_ELx share the bottom 13 bits, but the RES0 bits may mean
+ * different things in the future...
+ */
+#define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
 /* ESR value templates for specific events */
 
 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 0c2eec490abf..bc30429d8e91 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#include <asm/esr.h>
+
 #include <linux/interrupt.h>
 
 #define __exception	__attribute__((section(".exception.text")))
@@ -27,4 +29,16 @@
 #define __exception_irq_entry	__exception
 #endif
 
+static inline u32 disr_to_esr(u64 disr)
+{
+	unsigned int esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;
+
+	if ((disr & DISR_EL1_IDS) == 0)
+		esr |= (disr & DISR_EL1_ESR_MASK);
+	else
+		esr |= (disr & ESR_ELx_ISS_MASK);
+
+	return esr;
+}
+
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 023cacb946c3..cee4ae25a5d1 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -216,6 +216,7 @@ static inline void spin_lock_prefetch(const void *ptr)
 
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
+int cpu_clear_disr(void *__unused);
 
 /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
 #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 1281bc8263c2..115b89aeec00 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -279,6 +279,7 @@
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
 #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
+#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1,  1)
 
 #define SYS_ICC_IAR0_EL1		sys_reg(3, 0, 12, 8, 0)
 #define SYS_ICC_EOIR0_EL1		sys_reg(3, 0, 12, 8, 1)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0c0af18121e1..ae5433005504 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1039,6 +1039,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.sign = FTR_UNSIGNED,
 		.field_pos = ID_AA64PFR0_RAS_SHIFT,
 		.min_field_value = ID_AA64PFR0_RAS_V1,
+		.enable = cpu_clear_disr,
 	},
 #endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
@@ -1466,3 +1467,11 @@ static int __init enable_mrs_emulation(void)
 }
 
 core_initcall(enable_mrs_emulation);
+
+int cpu_clear_disr(void *__unused)
+{
+	/* Firmware may have left a deferred SError in this register. */
+	write_sysreg_s(0, SYS_DISR_EL1);
+
+	return 0;
+}
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index ec4c3d82b4d6..05e4ae934b23 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -132,6 +132,11 @@ alternative_endif
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
+
+alternative_if ARM64_HAS_RAS_EXTN
+	msr_s	SYS_DISR_EL1, xzr
+alternative_else_nop_endif
+
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 07/13] KVM: arm/arm64: mask/unmask daif around VHE guests
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

Non-VHE systems take an exception to EL2 in order to world-switch into the
guest. When returning from the guest KVM implicitly restores the DAIF
flags when it returns to the kernel at EL1.

With VHE none of this exception-level jumping happens, so KVM's
world-switch code is exposed to the host kernel's DAIF values, and KVM
spills the guest-exit DAIF values back into the host kernel.
On entry to a guest we have Debug and SError exceptions unmasked; KVM
has switched VBAR but isn't prepared to handle these. On guest exit
Debug exceptions are left disabled once we return to the host and will
stay this way until we enter user space.

Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
happen after the host's VBAR value has been synchronised by the isb in
__vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
setting KVM's VBAR value, but is kept here for symmetry.
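
The resulting window, sketched from the vcpu-run path in the diff below:

	if (has_vhe())
		kvm_arm_vhe_guest_enter();	/* local_daif_mask(): D,A,I,F masked */

	ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);

	if (has_vhe())
		kvm_arm_vhe_guest_exit();	/* DAIF_PROCCTX_NOIRQ: Debug and
						 * SError unmasked, IRQs stay
						 * masked until later */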

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
This isn't backportable because of the 'daif' helpers; I will produce a
backport once it's merged.

Changes since v4:
 * Added empty declarations for 32bit. (how did I miss that?)

v5 of this patch missed a Reviewed-by, which comes from here:
https://patchwork.kernel.org/patch/10017467/

 arch/arm/include/asm/kvm_host.h   |  2 ++
 arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
 virt/kvm/arm/arm.c                |  4 ++++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index a9f7d3f47134..b86fc4162539 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -301,4 +301,6 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 /* All host FP/SIMD state is restored on guest exit, so nothing to save: */
 static inline void kvm_fpsimd_flush_cpu_state(void) {}
 
+static inline void kvm_arm_vhe_guest_enter(void) {}
+static inline void kvm_arm_vhe_guest_exit(void) {}
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7ee72b402907..dcdd08edf5a5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cpufeature.h>
+#include <asm/daifflags.h>
 #include <asm/fpsimd.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
@@ -398,4 +399,13 @@ static inline void kvm_fpsimd_flush_cpu_state(void)
 		sve_flush_cpu_state();
 }
 
+static inline void kvm_arm_vhe_guest_enter(void)
+{
+	local_daif_mask();
+}
+
+static inline void kvm_arm_vhe_guest_exit(void)
+{
+	local_daif_restore(DAIF_PROCCTX_NOIRQ);
+}
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2fc6009a766c..38e81631fc91 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -704,9 +704,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		trace_kvm_entry(*vcpu_pc(vcpu));
 		guest_enter_irqoff();
+		if (has_vhe())
+			kvm_arm_vhe_guest_enter();
 
 		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
 
+		if (has_vhe())
+			kvm_arm_vhe_guest_exit();
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		vcpu->stat.exits++;
 		/*
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 08/13] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
generated an SError with an implementation-defined ESR_EL1.ISS, because we
had no mechanism to specify the ESR value.

On Juno this generates an all-zero ESR; the most significant bit, 'ISV',
is clear, indicating the remainder of the ISS field is invalid.

With the RAS Extensions we have a mechanism to specify this value, and the
most significant bit has a new meaning: 'IDS - Implementation Defined
Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
instead of 'no valid ISS'.

Add KVM support for the VSESR_EL2 register to specify an ESR value when
HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
specify an implementation-defined value.

We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set: KVM
saves/restores this bit during __{,de}activate_traps(), and hardware clears
the bit once the guest has consumed the virtual SError.

Future patches may add an API (or KVM CAP) to pend a virtual SError with
a specified ESR.
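
Such an API might, purely as a hypothetical sketch (the name and the
validation below are illustrative, not part of this series):

	/* Hypothetical: pend a virtual SError with a caller-chosen syndrome */
	static int kvm_pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
	{
		if (esr & ~(u64)ESR_ELx_ISS_MASK)	/* only ISS/IDS bits */
			return -EINVAL;
		pend_guest_serror(vcpu, esr);
		return 0;
	}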

Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
Changes since v4:
 * Fixed an 'uncategorized' typo

 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  3 +++
 arch/arm64/include/asm/sysreg.h      |  1 +
 arch/arm64/kvm/hyp/switch.c          |  3 +++
 arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
 5 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5f28dfa14cee..6d3614795197 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -64,6 +64,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
 	vcpu->arch.hcr_el2 = hcr;
 }
 
+static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
+{
+	vcpu->arch.vsesr_el2 = vsesr;
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
 	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index dcdd08edf5a5..3014b39b8fe2 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -280,6 +280,9 @@ struct kvm_vcpu_arch {
 
 	/* Detect first run of a vcpu */
 	bool has_run_once;
+
+	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
+	u64 vsesr_el2;
 };
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 115b89aeec00..aec10f75a43c 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -355,6 +355,7 @@
 
 #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
 #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
+#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 324f4202cdd5..b425b8aab45b 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -94,6 +94,9 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 
 	write_sysreg(val, hcr_el2);
 
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
+		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
 	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
 	write_sysreg(1 << 15, hstr_el2);
 	/*
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index 8ecbcb40e317..60666a056944 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -164,14 +164,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
+static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+{
+	vcpu_set_vsesr(vcpu, esr);
+	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+}
+
 /**
  * kvm_inject_vabt - inject an async abort / SError into the guest
  * @vcpu: The VCPU to receive the exception
  *
  * It is assumed that this code is called from the VCPU thread and that the
  * VCPU therefore is not currently executing guest code.
+ *
+ * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
+ * the remaining ISS all-zeros so that this error is not interpreted as an
+ * uncategorized RAS error. Without the RAS Extensions we can't specify an ESR
+ * value, so the CPU generates an imp-def value.
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+	pend_guest_serror(vcpu, ESR_ELx_ISV);
 }
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 09/13] KVM: arm64: Save/Restore guest DISR_EL1
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

If we deliver a virtual SError to the guest, the guest may defer it
with an ESB instruction. The guest reads the deferred value via DISR_EL1,
but the guest's view of DISR_EL1 is re-mapped to VDISR_EL2 when HCR_EL2.AMO
is set.

Add the KVM code to save/restore VDISR_EL2, and make it accessible to
userspace as DISR_EL1.
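
From the guest's side nothing changes; a sketch, assuming HCR_EL2.AMO is
set so the DISR_EL1 access is redirected to VDISR_EL2:

	u64 disr;

	asm volatile("hint #16" ::: "memory");	/* esb: defer the vSError */
	disr = read_sysreg_s(SYS_DISR_EL1);	/* actually reads VDISR_EL2 */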

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
v5 of this patch missed some Reviewed-bys, which came from here:
https://patchwork.kernel.org/patch/10017465/

 arch/arm64/include/asm/kvm_host.h | 1 +
 arch/arm64/include/asm/sysreg.h   | 1 +
 arch/arm64/kvm/hyp/sysreg-sr.c    | 6 ++++++
 arch/arm64/kvm/sys_regs.c         | 1 +
 4 files changed, 9 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3014b39b8fe2..84fcb2a896a1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -121,6 +121,7 @@ enum vcpu_sysreg {
 	PAR_EL1,	/* Physical Address Register */
 	MDSCR_EL1,	/* Monitor Debug System Control Register */
 	MDCCINT_EL1,	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+	DISR_EL1,	/* Deferred Interrupt Status Register */
 
 	/* Performance Monitors Registers */
 	PMCR_EL0,	/* Control Register */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index aec10f75a43c..e213bd92e998 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -358,6 +358,7 @@
 #define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
+#define SYS_VDISR_EL2			sys_reg(3, 4, 12, 1,  1)
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
 #define SYS_ICH_AP0R0_EL2		__SYS__AP0Rx_EL2(0)
 #define SYS_ICH_AP0R1_EL2		__SYS__AP0Rx_EL2(1)
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index c54cc2afb92b..2c17afd2be96 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -66,6 +66,9 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
 	ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);
 	ctxt->gp_regs.regs.pc		= read_sysreg_el2(elr);
 	ctxt->gp_regs.regs.pstate	= read_sysreg_el2(spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_save_host_state,
@@ -119,6 +122,9 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
 	write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);
 	write_sysreg_el2(ctxt->gp_regs.regs.pc,		elr);
 	write_sysreg_el2(ctxt->gp_regs.regs.pstate,	spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_restore_host_state,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1830ebc227d1..9edf4ac8a320 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1169,6 +1169,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
 	{ SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 },
+	{ SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 },
 
 	{ SYS_DESC(SYS_ICC_IAR0_EL1), write_to_read_only },
 	{ SYS_DESC(SYS_ICC_EOIR0_EL1), read_from_write_only },
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 10/13] KVM: arm64: Save ESR_EL2 on guest SError
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.
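
A sketch of what handle_exit() can now do with the saved value, assuming
the arm64_is_ras_serror() helper added earlier in the series:

	u32 esr = kvm_vcpu_get_hsr(vcpu);	/* valid for SError exits too */

	if (arm64_is_ras_serror(esr)) {
		/* decode the severity from the ISS, e.g. the AET field */
	}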

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * Switched to Marc's exit_code != irq version

(Christoffer gave Reviewed-by for v2, which I missed (sorry); the patch
 has changed since then.)

 arch/arm64/kvm/hyp/switch.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index b425b8aab45b..036e1f3d77a6 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -239,11 +239,12 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
 
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u64 esr = read_sysreg_el2(esr);
-	u8 ec = ESR_ELx_EC(esr);
+	u8 ec;
+	u64 esr;
 	u64 hpfar, far;
 
-	vcpu->arch.fault.esr_el2 = esr;
+	esr = vcpu->arch.fault.esr_el2;
+	ec = ESR_ELx_EC(esr);
 
 	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
 		return true;
@@ -336,6 +337,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	exit_code = __guest_enter(vcpu, host_ctxt);
 	/* And we're baaack! */
 
+	if (ARM_EXCEPTION_CODE(exit_code) != ARM_EXCEPTION_IRQ)
+		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
 	/*
 	 * We're using the raw exception code in order to only process
 	 * the trap if no SError is pending. We will come back to the
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, and either may be
a RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError at EL2 when we unmask PSTATE.A in __guest_exit.

For SError that interrupt a guest and are routed to EL2 the existing
behaviour is to inject an impdef SError into the guest.

Add code to handle RAS SErrors based on the ESR. For uncontained and
uncategorized errors arm64_is_fatal_ras_serror() will panic(); these
errors compromise the host too. All other error types are contained:
for the fatal errors the vCPU can't make progress, so we inject a virtual
SError. We ignore contained errors where we can make progress: if we're
lucky, we may not hit them again.
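
The policy condenses to the kvm_handle_guest_serror() helper in the diff
below; arm64_is_fatal_ras_serror() is expected to panic() internally for
the uncontained and uncategorized cases:

	static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
	{
		if (!arm64_is_ras_serror(esr) ||	   /* impdef SError */
		    arm64_is_fatal_ras_serror(NULL, esr))  /* fatal for the vCPU */
			kvm_inject_vabt(vcpu);
		/* else: contained and recoverable - ignore, re-enter the guest */
	}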

If only some of the CPUs support RAS, the guest will see the cpufeature-
sanitised version of the id registers, but we may still take a RAS SError
on this CPU. Move the SError handling out of handle_exit() into a new
handler that runs before we can be preempted. This allows us to use
this_cpu_has_cap(), via arm64_is_ras_serror().

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * Moved SError handling into handle_exit_early(). This will need to move
   earlier, into an SError-masked region once we support kernel-first.
   (hence the vague name)
 * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().

 arch/arm/include/asm/kvm_host.h   |  3 +++
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/kvm/handle_exit.c      | 18 +++++++++++++++++-
 virt/kvm/arm/arm.c                |  3 +++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b86fc4162539..acbf9ec7b396 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -238,6 +238,9 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
 
+static inline void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				     int exception_index) {}
+
 static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 				       unsigned long hyp_stack_ptr,
 				       unsigned long vector_ptr)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 84fcb2a896a1..abcfd164e690 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -347,6 +347,8 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
+void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+		       int exception_index);
 
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 304203fa9e33..6a5a5db4292f 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -29,12 +29,19 @@
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_psci.h>
 #include <asm/debug-monitors.h>
+#include <asm/traps.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
 typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
 
+static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
+{
+	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(NULL, esr))
+		kvm_inject_vabt(vcpu);
+}
+
 static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	int ret;
@@ -252,7 +259,6 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	case ARM_EXCEPTION_IRQ:
 		return 1;
 	case ARM_EXCEPTION_EL1_SERROR:
-		kvm_inject_vabt(vcpu);
 		/* We may still need to return for single-step */
 		if (!(*vcpu_cpsr(vcpu) & DBG_SPSR_SS)
 			&& kvm_arm_handle_step_debug(vcpu, run))
@@ -275,3 +281,13 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		return 0;
 	}
 }
+
+/* For exit types that need handling before we can be preempted */
+void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+		       int exception_index)
+{
+	exception_index = ARM_EXCEPTION_CODE(exception_index);
+
+	if (exception_index == ARM_EXCEPTION_EL1_SERROR)
+		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
+}
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 38e81631fc91..15bf026eb182 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -763,6 +763,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		guest_exit();
 		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 
+		/* Exit types that need handling before we can be preempted */
+		handle_exit_early(vcpu, run, ret);
+
 		preempt_enable();
 
 		ret = handle_exit(vcpu, run, ret);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 12/13] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, and either may be
a RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError at EL2 when we unmask PSTATE.A in __guest_exit.

The current SError-from-EL2 code unmasks SError and tries to fence any
pending SError into a single instruction window. It then leaves SError
unmasked.

With the v8.2 RAS Extensions we may take an SError for a 'corrected'
error, but KVM is only able to handle SErrors from EL2 if they occur
during this single instruction window...

The RAS Extensions give us a new instruction to synchronise and
consume SErrors. The RAS Extensions document (ARM DDI0587),
'2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
SError interrupts generated by 'instructions, translation table walks,
hardware updates to the translation tables, and instruction fetches on
the same PE'. This makes ESB equivalent to KVM's existing
'dsb, msr-daifclr, isb' sequence.
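
In other words, a sketch assuming a RAS-capable CPU ('hint #16' is the
esb encoding):

	/*
	 * Pre-RAS sequence (what entry.S does today):
	 *	dsb sy; msr daifclr, #4; isb	// pending SError is taken here
	 * With the RAS Extensions the same synchronisation becomes:
	 */
	asm volatile("hint #16" ::: "memory");	/* SError deferred to DISR_EL1 */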

Use the alternatives to synchronise and consume any SError using ESB
instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
in the exit_code so that we can restart the vcpu if it turns out this
SError has no impact on the vcpu.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * Moved the SError handling into handle_exit_early()
 * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().

Changes since v3:
 * Moved that nop out of the firing line

 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  1 +
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kvm/handle_exit.c         | 14 +++++++++++++-
 arch/arm64/kvm/hyp/entry.S           | 13 +++++++++++++
 5 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 6d3614795197..e002ab7f919a 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -176,6 +176,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
 	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
 }
 
+static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.disr_el1;
+}
+
 static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index abcfd164e690..4485ae8e98de 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -90,6 +90,7 @@ struct kvm_vcpu_fault_info {
 	u32 esr_el2;		/* Hyp Syndrom Register */
 	u64 far_el2;		/* Hyp Fault Address Register */
 	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
+	u64 disr_el1;		/* Deferred [SError] Status Register */
 };
 
 /*
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 1dcc493f5765..1303e04110cd 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -132,6 +132,7 @@ int main(void)
   BLANK();
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
+  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
   DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
   DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 6a5a5db4292f..c09fc5a576c7 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -23,6 +23,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/esr.h>
+#include <asm/exception.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_emulate.h>
@@ -249,7 +250,6 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			*vcpu_pc(vcpu) -= adj;
 		}
 
-		kvm_inject_vabt(vcpu);
 		return 1;
 	}
 
@@ -286,6 +286,18 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		       int exception_index)
 {
+	if (ARM_SERROR_PENDING(exception_index)) {
+		if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
+			u64 disr = kvm_vcpu_get_disr(vcpu);
+
+			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
+		} else {
+			kvm_inject_vabt(vcpu);
+		}
+
+		return;
+	}
+
 	exception_index = ARM_EXCEPTION_CODE(exception_index);
 
 	if (exception_index == ARM_EXCEPTION_EL1_SERROR)
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index fe4678f20a85..fdd1068ee3a5 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -124,6 +124,17 @@ ENTRY(__guest_exit)
 	// Now restore the host regs
 	restore_callee_saved_regs x2
 
+alternative_if ARM64_HAS_RAS_EXTN
+	// If we have the RAS extensions we can consume a pending error
+	// without an unmask-SError and isb.
+	esb
+	mrs_s	x2, SYS_DISR_EL1
+	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
+	cbz	x2, 1f
+	msr_s	SYS_DISR_EL1, xzr
+	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
+1:	ret
+alternative_else
 	// If we have a pending asynchronous abort, now is the
 	// time to find out. From your VAXorcist book, page 666:
 	// "Threaten me not, oh Evil one!  For I speak with
@@ -134,7 +145,9 @@ ENTRY(__guest_exit)
 	mov	x5, x0
 
 	dsb	sy		// Synchronize against in-flight ld/st
+	nop
 	msr	daifclr, #4	// Unmask aborts
+alternative_endif
 
 	// This is our single instruction exception window. A pending
 	// SError is guaranteed to occur at the earliest when we unmask
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v6 13/13] KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
  2018-01-15 19:38 ` James Morse
@ 2018-01-15 19:39   ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-15 19:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

From: Dongjiu Geng <gengdongjiu@huawei.com>

ARMv8.2 adds a new bit HCR_EL2.TEA which routes synchronous external
aborts to EL2, and adds a trap control bit HCR_EL2.TERR which traps
all Non-secure EL1&0 error record accesses to EL2.

This patch enables the two bits for the guest OS, guaranteeing that
KVM takes external aborts and traps attempts to access the physical
error registers.

ERRIDR_EL1 advertises the number of error records; we return
zero, meaning we can treat all the other error-record registers as RAZ/WI too.
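
For illustration, a guest-side sketch (not part of the patch) of why zero
error records makes treating the rest as RAZ/WI safe:

	u64 i, num;

	/* ERRIDR_EL1.NUM is bits [15:0]; the trap makes it read as zero */
	num = read_sysreg_s(SYS_ERRIDR_EL1) & 0xffff;
	for (i = 0; i < num; i++) {			/* loop never runs */
		write_sysreg_s(i, SYS_ERRSELR_EL1);	/* would be write-ignored */
		/* the ERX* registers would all read as zero */
	}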

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
[removed specific emulation, use trap_raz_wi() directly for everything,
 rephrased parts of the commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
---
v5 of this series/patch missed the Reviewed-by's, which come from here:
https://patchwork.kernel.org/patch/10017537/

 arch/arm64/include/asm/kvm_arm.h     |  2 ++
 arch/arm64/include/asm/kvm_emulate.h |  7 +++++++
 arch/arm64/include/asm/sysreg.h      | 10 ++++++++++
 arch/arm64/kvm/sys_regs.c            | 10 ++++++++++
 4 files changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 715d395ef45b..b0c84171e6a3 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,8 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_TEA		(UL(1) << 37)
+#define HCR_TERR	(UL(1) << 36)
 #define HCR_E2H		(UL(1) << 34)
 #define HCR_ID		(UL(1) << 33)
 #define HCR_CD		(UL(1) << 32)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index e002ab7f919a..413dc82b1e89 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -50,6 +50,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
 	if (is_kernel_in_hyp_mode())
 		vcpu->arch.hcr_el2 |= HCR_E2H;
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+		/* route synchronous external abort exceptions to EL2 */
+		vcpu->arch.hcr_el2 |= HCR_TEA;
+		/* trap error record accesses */
+		vcpu->arch.hcr_el2 |= HCR_TERR;
+	}
+
 	if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
 		vcpu->arch.hcr_el2 &= ~HCR_RW;
 }
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index e213bd92e998..a16eb8d00689 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -176,6 +176,16 @@
 #define SYS_AFSR0_EL1			sys_reg(3, 0, 5, 1, 0)
 #define SYS_AFSR1_EL1			sys_reg(3, 0, 5, 1, 1)
 #define SYS_ESR_EL1			sys_reg(3, 0, 5, 2, 0)
+
+#define SYS_ERRIDR_EL1			sys_reg(3, 0, 5, 3, 0)
+#define SYS_ERRSELR_EL1			sys_reg(3, 0, 5, 3, 1)
+#define SYS_ERXFR_EL1			sys_reg(3, 0, 5, 4, 0)
+#define SYS_ERXCTLR_EL1			sys_reg(3, 0, 5, 4, 1)
+#define SYS_ERXSTATUS_EL1		sys_reg(3, 0, 5, 4, 2)
+#define SYS_ERXADDR_EL1			sys_reg(3, 0, 5, 4, 3)
+#define SYS_ERXMISC0_EL1		sys_reg(3, 0, 5, 5, 0)
+#define SYS_ERXMISC1_EL1		sys_reg(3, 0, 5, 5, 1)
+
 #define SYS_FAR_EL1			sys_reg(3, 0, 6, 0, 0)
 #define SYS_PAR_EL1			sys_reg(3, 0, 7, 4, 0)
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 9edf4ac8a320..50a43c7b97ca 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1159,6 +1159,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
 	{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
 	{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
+
+	{ SYS_DESC(SYS_ERRIDR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERRSELR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXFR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXCTLR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXSTATUS_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXADDR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
+
 	{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
 	{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-15 19:39   ` James Morse
  (?)
@ 2018-01-16  9:29   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16  9:29 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:39, James Morse wrote:
> We expect to have firmware-first handling of RAS SErrors, with errors
> notified via an APEI method. For systems without firmware-first, add
> some minimal handling to KVM.
> 
> There are two ways KVM can take an SError due to a guest, either may be a
> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
> 
> For SError that interrupt a guest and are routed to EL2 the existing
> behaviour is to inject an impdef SError into the guest.
> 
> Add code to handle RAS SError based on the ESR. For uncontained and
> uncategorized errors arm64_is_fatal_ras_serror() will panic(), these
> errors compromise the host too. All other error types are contained:
> For the fatal errors the vCPU can't make progress, so we inject a virtual
> SError. We ignore contained errors where we can make progress as if
> we're lucky, we may not hit them again.
> 
> If only some of the CPUs support RAS the guest will see the cpufeature
> sanitised version of the id registers, but we may still take RAS SError
> on this CPU. Move the SError handling out of handle_exit() into a new
> handler that runs before we can be preempted. This allows us to use
> this_cpu_has_cap(), via arm64_is_ras_serror().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v4:
>  * Moved SError handling into handle_exit_early(). This will need to move
>    earlier, into an SError-masked region once we support kernel-first.
>    (hence the vauge name)
>  * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().
> 
>  arch/arm/include/asm/kvm_host.h   |  3 +++
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/handle_exit.c      | 18 +++++++++++++++++-
>  virt/kvm/arm/arm.c                |  3 +++
>  4 files changed, 25 insertions(+), 1 deletion(-)

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 12/13] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2018-01-15 19:39   ` James Morse
  (?)
@ 2018-01-16  9:36   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16  9:36 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:39, James Morse wrote:
> We expect to have firmware-first handling of RAS SErrors, with errors
> notified via an APEI method. For systems without firmware-first, add
> some minimal handling to KVM.
> 
> There are two ways KVM can take an SError due to a guest, either may be a
> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
> 
> The current SError from EL2 code unmasks SError and tries to fence any
> pending SError into a single instruction window. It then leaves SError
> unmasked.
> 
> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
> error, but KVM is only able to handle SError from EL2 if they occur
> during this single instruction window...
> 
> The RAS Extensions give us a new instruction to synchronise and
> consume SErrors. The RAS Extensions document (ARM DDI0587),
> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
> SError interrupts generated by 'instructions, translation table walks,
> hardware updates to the translation tables, and instruction fetches on
> the same PE'. This makes ESB equivalent to KVMs existing
> 'dsb, mrs-daifclr, isb' sequence.
> 
> Use the alternatives to synchronise and consume any SError using ESB
> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
> in the exit_code so that we can restart the vcpu if it turns out this
> SError has no impact on the vcpu.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v4:
>  * Moved the SError handling into handle_exit_early()
>  * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().
> 
> Changes since v3:
>  * Moved that nop out of the firing line
> 
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++++
>  arch/arm64/include/asm/kvm_host.h    |  1 +
>  arch/arm64/kernel/asm-offsets.c      |  1 +
>  arch/arm64/kvm/handle_exit.c         | 14 +++++++++++++-
>  arch/arm64/kvm/hyp/entry.S           | 13 +++++++++++++
>  5 files changed, 33 insertions(+), 1 deletion(-)

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 10/13] KVM: arm64: Save ESR_EL2 on guest SError
  2018-01-15 19:39   ` James Morse
  (?)
@ 2018-01-16  9:41   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16  9:41 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:39, James Morse wrote:
> When we exit a guest due to an SError the vcpu fault info isn't updated
> with the ESR. Today this is only done for traps.
> 
> The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
> fault_info with the ESR on SError so that handle_exit() can determine
> if this was a RAS SError and decode its severity.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v4:
>  * Switched to Marc's exit_code != irq version
> 
> (Christoffer gave Reviewed-by for v2, which I missed (sorry), the patch
>  has changed since then..)
> 
>  arch/arm64/kvm/hyp/switch.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 01/13] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
  2018-01-15 19:38   ` James Morse
  (?)
@ 2018-01-16  9:51   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16  9:51 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:38, James Morse wrote:
> this_cpu_has_cap() tests caps->desc not caps->matches, so it stops
> walking the list when it finds a 'silent' feature, instead of
> walking to the end of the list.
> 
> Prior to v4.6's 644c2ae198412 ("arm64: cpufeature: Test 'matches' pointer
> to find the end of the list") we always tested desc to find the end of
> a capability list. This was changed for dubious things like PAN_NOT_UAO.
> v4.7's e3661b128e53e ("arm64: Allow a capability to be checked on
> single CPU") added this_cpu_has_cap() using the old desc style test.
> 
> CC: Suzuki K Poulose <suzuki.poulose@arm.com>
> CC: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 05/13] arm64: Unconditionally enable IESB on exception entry/return for firmware-first
  2018-01-15 19:38   ` James Morse
  (?)
@ 2018-01-16  9:55   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16  9:55 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:38, James Morse wrote:
> ARM v8.2 has a feature to add implicit error synchronization barriers
> whenever the CPU enters or returns from an exception level. Add this to the
> features we always enable. CPUs that don't support this feature will treat
> the bit as RES0.
> 
> This feature causes RAS errors that are not yet visible to software to
> become pending SErrors. We expect to have firmware-first RAS support
> so synchronised RAS errors will be taken immediately to EL3.
> Any system without firmware-first handling of errors will take the SError
> either immediately after exception return, or when we unmask SError after
> entry.S's work.
> 
> Adding IESB to the ELx flags causes it to be enabled by KVM and kexec
> too.
> 
> Platform level RAS support may require additional firmware support.
> 
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Suggested-by: Will Deacon <will.deacon@arm.com>
> Link: https://www.spinics.net/lists/kvm-arm/msg28192.html
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 07/13] KVM: arm/arm64: mask/unmask daif around VHE guests
  2018-01-15 19:39   ` James Morse
  (?)
@ 2018-01-16 10:01   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16 10:01 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Catalin Marinas, Will Deacon, Dongjiu Geng, kvmarm

On 15/01/18 19:39, James Morse wrote:
> Non-VHE systems take an exception to EL2 in order to world-switch into the
> guest. When returning from the guest KVM implicitly restores the DAIF
> flags when it returns to the kernel at EL1.
> 
> With VHE none of this exception-level jumping happens, so KVMs
> world-switch code is exposed to the host kernel's DAIF values, and KVM
> spills the guest-exit DAIF values back into the host kernel.
> On entry to a guest we have Debug and SError exceptions unmasked, KVM
> has switched VBAR but isn't prepared to handle these. On guest exit
> Debug exceptions are left disabled once we return to the host and will
> stay this way until we enter user space.
> 
> Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
> happen after the hosts VBAR value has been synchronised by the isb in
> __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
> setting KVM's VBAR value, but is kept here for symmetry.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 08/13] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  2018-01-15 19:39   ` James Morse
  (?)
@ 2018-01-16 10:05   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2018-01-16 10:05 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Xie XiuQi, Catalin Marinas, Suzuki K Poulose,
	Will Deacon, Dongjiu Geng, kvmarm, Christoffer Dall

On 15/01/18 19:39, James Morse wrote:
> Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
> generated an SError with an implementation defined ESR_EL1.ISS, because we
> had no mechanism to specify the ESR value.
> 
> On Juno this generates an all-zero ESR, the most significant bit 'ISV'
> is clear indicating the remainder of the ISS field is invalid.
> 
> With the RAS Extensions we have a mechanism to specify this value, and the
> most significant bit has a new meaning: 'IDS - Implementation Defined
> Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
> instead of 'no valid ISS'.
> 
> Add KVM support for the VSESR_EL2 register to specify an ESR value when
> HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
> specify an implementation-defined value.
> 
> We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM
> save/restores this bit during __{,de}activate_traps() and hardware clears the
> bit once the guest has consumed the virtual-SError.
> 
> Future patches may add an API (or KVM CAP) to pend a virtual SError with
> a specified ESR.
> 
> Cc: Dongjiu Geng <gengdongjiu@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
> Changes since v4:
>  * Fixed an 'uncategorized' typo
> 
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++++
>  arch/arm64/include/asm/kvm_host.h    |  3 +++
>  arch/arm64/include/asm/sysreg.h      |  1 +
>  arch/arm64/kvm/hyp/switch.c          |  3 +++
>  arch/arm64/kvm/inject_fault.c        | 13 ++++++++++++-
>  5 files changed, 24 insertions(+), 1 deletion(-)

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extentions
  2018-01-15 19:38   ` James Morse
  (?)
@ 2018-01-16 10:26   ` Suzuki K Poulose
  -1 siblings, 0 replies; 60+ messages in thread
From: Suzuki K Poulose @ 2018-01-16 10:26 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

On 15/01/18 19:38, James Morse wrote:
> From: Xie XiuQi <xiexiuqi@huawei.com>
> 
> ARM's v8.2 Extensions add support for Reliability, Availability and
> Serviceability (RAS). On CPUs with these extensions system software
> can use additional barriers to isolate errors and determine if faults
> are pending. Add cpufeature detection.
> 
> Platform level RAS support may require additional firmware support.
> 
> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
> [Rebased added config option, reworded commit message]
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 06/13] arm64: kernel: Prepare for a DISR user
  2018-01-15 19:38   ` James Morse
  (?)
@ 2018-01-16 11:11   ` Suzuki K Poulose
  -1 siblings, 0 replies; 60+ messages in thread
From: Suzuki K Poulose @ 2018-01-16 11:11 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm

On 15/01/18 19:38, James Morse wrote:
> KVM would like to consume any pending SError (or RAS error) after guest
> exit. Today it has to unmask SError and use dsb+isb to synchronise the
> CPU. With the RAS extensions we can use ESB to synchronise any pending
> SError.
> 
> Add the necessary macros to allow DISR to be read and converted to an
> ESR.
> 
> We clear the DISR register when we enable the RAS cpufeature; at that
> point the kernel has not executed any ESB instructions, so any value we
> find in DISR must have belonged to firmware. Executing an ESB instruction is the
> only way to update DISR, so we can expect firmware to have handled
> any deferred SError. By the same logic we clear DISR in the idle path.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


James,

Looks fine to me. minor nit below

>   #endif	/* __ASM_EXCEPTION_H */
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 023cacb946c3..cee4ae25a5d1 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -216,6 +216,7 @@ static inline void spin_lock_prefetch(const void *ptr)
>   
>   int cpu_enable_pan(void *__unused);
>   int cpu_enable_cache_maint_trap(void *__unused);
> +int cpu_clear_disr(void *__unused);
>   
>   /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
>   #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 1281bc8263c2..115b89aeec00 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -279,6 +279,7 @@
>   #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
>   
>   #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
> +#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1,  1)

minor nit: additional white space	                   ^^^

With that fixed,

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extentions
  2018-01-15 19:38   ` James Morse
  (?)
  (?)
@ 2018-01-16 11:17   ` gengdongjiu
  2018-01-22 19:32       ` James Morse
  -1 siblings, 1 reply; 60+ messages in thread
From: gengdongjiu @ 2018-01-16 11:17 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon, kvmarm

Hi James,

On 2018/1/16 3:38, James Morse wrote:
> From: Xie XiuQi <xiexiuqi@huawei.com>
> 
> ARM's v8.2 Extensions add support for Reliability, Availability and
> Serviceability (RAS). On CPUs with these extensions system software
> can use additional barriers to isolate errors and determine if faults
> are pending. Add cpufeature detection.
> 
> Platform level RAS support may require additional firmware support.
> 
> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
> [Rebased added config option, reworded commit message]
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> Changes since v4:
>  * Removed barrier in context switch
> 
>  arch/arm64/Kconfig               | 16 ++++++++++++++++
>  arch/arm64/include/asm/cpucaps.h |  3 ++-
>  arch/arm64/include/asm/sysreg.h  |  2 ++
>  arch/arm64/kernel/cpufeature.c   | 13 +++++++++++++
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 664fadc2aa2e..1d51c8edf34b 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1062,6 +1062,22 @@ config ARM64_PMEM
>  	  operations if DC CVAP is not supported (following the behaviour of
>  	  DC CVAP itself if the system does not define a point of persistence).
>  
> +config ARM64_RAS_EXTN
> +	bool "Enable support for RAS CPU Extensions"
> +	default y
> +	help
> +	  CPUs that support the Reliability, Availability and Serviceability
> +	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
> +	  errors, classify them and report them to software.
> +
> +	  On CPUs with these extensions system software can use additional
> +	  barriers to determine if faults are pending and read the
> +	  classification from a new set of registers.
> +
> +	  Selecting this feature will allow the kernel to use these barriers
> +	  and access the new registers if the system supports the extension.
> +	  Platform RAS features may additionally depend on firmware support.
> +
>  endmenu
>  
It seems this "CONFIG_ARM64_RAS_EXTN" is not enabled in "arch/arm64/configs/defconfig";
if it is not, I would like to enable this config in the defconfig to turn on the RAS feature. Do you agree?
Thanks

>  config ARM64_SVE
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 7049b4802587..bb263820de13 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -44,7 +44,8 @@
>  #define ARM64_UNMAP_KERNEL_AT_EL0		23
>  #define ARM64_HARDEN_BRANCH_PREDICTOR		24
>  #define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
> +#define ARM64_HAS_RAS_EXTN			26
>  
> -#define ARM64_NCAPS				26
> +#define ARM64_NCAPS				27
>  
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 1a8108f84932..321622e9f9c3 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -498,6 +498,7 @@
>  #define ID_AA64PFR0_CSV3_SHIFT		60
>  #define ID_AA64PFR0_CSV2_SHIFT		56
>  #define ID_AA64PFR0_SVE_SHIFT		32
> +#define ID_AA64PFR0_RAS_SHIFT		28
>  #define ID_AA64PFR0_GIC_SHIFT		24
>  #define ID_AA64PFR0_ASIMD_SHIFT		20
>  #define ID_AA64PFR0_FP_SHIFT		16
> @@ -507,6 +508,7 @@
>  #define ID_AA64PFR0_EL0_SHIFT		0
>  
>  #define ID_AA64PFR0_SVE			0x1
> +#define ID_AA64PFR0_RAS_V1		0x1
>  #define ID_AA64PFR0_FP_NI		0xf
>  #define ID_AA64PFR0_FP_SUPPORTED	0x0
>  #define ID_AA64PFR0_ASIMD_NI		0xf
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d88cd0e88606..0c0af18121e1 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -149,6 +149,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
> +	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
>  	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
>  	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
> @@ -1028,6 +1029,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.enable = sve_kernel_enable,
>  	},
>  #endif /* CONFIG_ARM64_SVE */
> +#ifdef CONFIG_ARM64_RAS_EXTN
> +	{
> +		.desc = "RAS Extension Support",
> +		.capability = ARM64_HAS_RAS_EXTN,
> +		.def_scope = SCOPE_SYSTEM,
> +		.matches = has_cpuid_feature,
> +		.sys_reg = SYS_ID_AA64PFR0_EL1,
> +		.sign = FTR_UNSIGNED,
> +		.field_pos = ID_AA64PFR0_RAS_SHIFT,
> +		.min_field_value = ID_AA64PFR0_RAS_V1,
> +	},
> +#endif /* CONFIG_ARM64_RAS_EXTN */
>  	{},
>  };
>  
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 01/13] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
  2018-01-15 19:38   ` James Morse
  (?)
  (?)
@ 2018-01-16 15:04   ` Catalin Marinas
  2018-01-16 15:09     ` Suzuki K Poulose
  -1 siblings, 1 reply; 60+ messages in thread
From: Catalin Marinas @ 2018-01-16 15:04 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, Dongjiu Geng, kvmarm,
	linux-arm-kernel

On Mon, Jan 15, 2018 at 07:38:54PM +0000, James Morse wrote:
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9ef84d0def9a..d88cd0e88606 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1303,8 +1303,8 @@ static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
>  	if (WARN_ON(preemptible()))
>  		return false;
>  
> -	for (caps = cap_array; caps->desc; caps++)
> -		if (caps->capability == cap && caps->matches)
> +	for (caps = cap_array; caps->matches; caps++)
> +		if (caps->capability == cap)
>  			return caps->matches(caps, SCOPE_LOCAL_CPU);
>  
>  	return false;

Just to make sure I applied this correctly on top of for-next/core:

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a11311397430..630a40ec1332 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1149,9 +1149,8 @@ static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
 	if (WARN_ON(preemptible()))
 		return false;
 
-	for (caps = cap_array; caps->desc; caps++)
+	for (caps = cap_array; caps->matches; caps++)
 		if (caps->capability == cap &&
-		    caps->matches &&
 		    caps->matches(caps, SCOPE_LOCAL_CPU))
 			return true;
 	return false;

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 01/13] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
  2018-01-16 15:04   ` Catalin Marinas
@ 2018-01-16 15:09     ` Suzuki K Poulose
  0 siblings, 0 replies; 60+ messages in thread
From: Suzuki K Poulose @ 2018-01-16 15:09 UTC (permalink / raw)
  To: Catalin Marinas, James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, Dongjiu Geng, kvmarm,
	linux-arm-kernel

On 16/01/18 15:04, Catalin Marinas wrote:
> On Mon, Jan 15, 2018 at 07:38:54PM +0000, James Morse wrote:
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 9ef84d0def9a..d88cd0e88606 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -1303,8 +1303,8 @@ static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
>>   	if (WARN_ON(preemptible()))
>>   		return false;
>>   
>> -	for (caps = cap_array; caps->desc; caps++)
>> -		if (caps->capability == cap && caps->matches)
>> +	for (caps = cap_array; caps->matches; caps++)
>> +		if (caps->capability == cap)
>>   			return caps->matches(caps, SCOPE_LOCAL_CPU);
>>   
>>   	return false;
> 
> Just to make sure I applied this correctly on top of for-next/core:
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a11311397430..630a40ec1332 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1149,9 +1149,8 @@ static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
>   	if (WARN_ON(preemptible()))
>   		return false;
>   
> -	for (caps = cap_array; caps->desc; caps++)
> +	for (caps = cap_array; caps->matches; caps++)
>   		if (caps->capability == cap &&
> -		    caps->matches &&
>   		    caps->matches(caps, SCOPE_LOCAL_CPU))
>   			return true;
>   	return false;
> 

Looks correct to me.

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 00/13] arm64/KVM: RAS & IESB for firmware first support
  2018-01-15 19:38 ` James Morse
                   ` (13 preceding siblings ...)
  (?)
@ 2018-01-16 17:36 ` Catalin Marinas
  -1 siblings, 0 replies; 60+ messages in thread
From: Catalin Marinas @ 2018-01-16 17:36 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Will Deacon, Dongjiu Geng, kvmarm,
	linux-arm-kernel

On Mon, Jan 15, 2018 at 07:38:53PM +0000, James Morse wrote:
> Dongjiu Geng (1):
>   KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
> 
> James Morse (11):
>   arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
>   arm64: sysreg: Move to use definitions for all the SCTLR bits
>   arm64: kernel: Survive corrected RAS errors notified by SError
>   arm64: Unconditionally enable IESB on exception entry/return for
>     firmware-first
>   arm64: kernel: Prepare for a DISR user
>   KVM: arm/arm64: mask/unmask daif around VHE guests
>   KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
>   KVM: arm64: Save/Restore guest DISR_EL1
>   KVM: arm64: Save ESR_EL2 on guest SError
>   KVM: arm64: Handle RAS SErrors from EL1 on guest exit
>   KVM: arm64: Handle RAS SErrors from EL2 on guest exit
> 
> Xie XiuQi (1):
>   arm64: cpufeature: Detect CPU RAS Extentions

Queued for 4.16. Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-15 19:39   ` James Morse
@ 2018-01-19 19:20     ` Christoffer Dall
  -1 siblings, 0 replies; 60+ messages in thread
From: Christoffer Dall @ 2018-01-19 19:20 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm, linux-arm-kernel

On Mon, Jan 15, 2018 at 07:39:04PM +0000, James Morse wrote:
> We expect to have firmware-first handling of RAS SErrors, with errors
> notified via an APEI method. For systems without firmware-first, add
> some minimal handling to KVM.
> 
> There are two ways KVM can take an SError due to a guest, either may be a
> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
> 
> For SError that interrupt a guest and are routed to EL2 the existing
> behaviour is to inject an impdef SError into the guest.
> 
> Add code to handle RAS SError based on the ESR. For uncontained and
> uncategorized errors arm64_is_fatal_ras_serror() will panic(), these
> errors compromise the host too. All other error types are contained:
> For the fatal errors the vCPU can't make progress, so we inject a virtual
> SError. We ignore contained errors where we can make progress as if
> we're lucky, we may not hit them again.
> 
> If only some of the CPUs support RAS the guest will see the cpufeature
> sanitised version of the id registers, but we may still take RAS SError
> on this CPU. Move the SError handling out of handle_exit() into a new
> handler that runs before we can be preempted. This allows us to use
> this_cpu_has_cap(), via arm64_is_ras_serror().

Would it be possible to optimize this a bit later on by caching
this_cpu_has_cap() in vcpu_load() so that we can use a single
handle_exit function to process all exits?
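
Something like the below, where the cached field is only a sketch:

	/* hypothetical: cache the capability at load time */
	void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
	{
		...
		vcpu->arch.ras_extn = this_cpu_has_cap(ARM64_HAS_RAS_EXTN);
	}

so handle_exit() could test vcpu->arch.ras_extn instead of having to run
before preempt_enable().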

Thanks,
-Christoffer

> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v4:
>  * Moved SError handling into handle_exit_early(). This will need to move
>    earlier, into an SError-masked region once we support kernel-first.
>    (hence the vague name)
>  * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().
> 
>  arch/arm/include/asm/kvm_host.h   |  3 +++
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/handle_exit.c      | 18 +++++++++++++++++-
>  virt/kvm/arm/arm.c                |  3 +++
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b86fc4162539..acbf9ec7b396 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -238,6 +238,9 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
>  int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		int exception_index);
>  
> +static inline void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +				     int exception_index) {}
> +
>  static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>  				       unsigned long hyp_stack_ptr,
>  				       unsigned long vector_ptr)
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 84fcb2a896a1..abcfd164e690 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -347,6 +347,8 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
>  int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		int exception_index);
> +void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +		       int exception_index);
>  
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 304203fa9e33..6a5a5db4292f 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -29,12 +29,19 @@
>  #include <asm/kvm_mmu.h>
>  #include <asm/kvm_psci.h>
>  #include <asm/debug-monitors.h>
> +#include <asm/traps.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include "trace.h"
>  
>  typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
>  
> +static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
> +{
> +	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(NULL, esr))
> +		kvm_inject_vabt(vcpu);
> +}
> +
>  static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
>  	int ret;
> @@ -252,7 +259,6 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	case ARM_EXCEPTION_IRQ:
>  		return 1;
>  	case ARM_EXCEPTION_EL1_SERROR:
> -		kvm_inject_vabt(vcpu);
>  		/* We may still need to return for single-step */
>  		if (!(*vcpu_cpsr(vcpu) & DBG_SPSR_SS)
>  			&& kvm_arm_handle_step_debug(vcpu, run))
> @@ -275,3 +281,13 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		return 0;
>  	}
>  }
> +
> +/* For exit types that need handling before we can be preempted */
> +void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +		       int exception_index)
> +{
> +	exception_index = ARM_EXCEPTION_CODE(exception_index);
> +
> +	if (exception_index == ARM_EXCEPTION_EL1_SERROR)
> +		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
> +}
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 38e81631fc91..15bf026eb182 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -763,6 +763,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		guest_exit();
>  		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
>  
> +		/* Exit types that need handling before we can be preempted */
> +		handle_exit_early(vcpu, run, ret);
> +
>  		preempt_enable();
>  
>  		ret = handle_exit(vcpu, run, ret);
> -- 
> 2.15.1
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 12/13] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  2018-01-15 19:39   ` James Morse
@ 2018-01-19 19:54     ` Christoffer Dall
  -1 siblings, 0 replies; 60+ messages in thread
From: Christoffer Dall @ 2018-01-19 19:54 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm, linux-arm-kernel

On Mon, Jan 15, 2018 at 07:39:05PM +0000, James Morse wrote:
> We expect to have firmware-first handling of RAS SErrors, with errors
> notified via an APEI method. For systems without firmware-first, add
> some minimal handling to KVM.
> 
> There are two ways KVM can take an SError due to a guest, and either may be a
> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
> 
> The current SError from EL2 code unmasks SError and tries to fence any
> pending SError into a single instruction window. It then leaves SError
> unmasked.
> 
> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
> error, but KVM is only able to handle SErrors from EL2 if they occur
> during this single instruction window...
> 
> The RAS Extensions give us a new instruction to synchronise and
> consume SErrors. The RAS Extensions document (ARM DDI0587),
> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
> SError interrupts generated by 'instructions, translation table walks,
> hardware updates to the translation tables, and instruction fetches on
> the same PE'. This makes ESB equivalent to KVM's existing
> 'dsb, msr-daifclr, isb' sequence.
> 
> Use the alternatives to synchronise and consume any SError using ESB
> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
> in the exit_code so that we can restart the vcpu if it turns out this
> SError has no impact on the vcpu.
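
As a C-level illustration of the choice described above (a sketch only:
kvm_sync_serror() is an invented name, the real code is assembly using
the alternatives framework, and the esb mnemonic needs a RAS-aware
assembler):

	static inline void kvm_sync_serror(void)
	{
		if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
			/* v8.2: synchronise and consume any pending SError */
			asm volatile("esb" : : : "memory");
		} else {
			/* fence, then unmask SError so a pending SError is
			 * taken here rather than at some later point */
			asm volatile("dsb sy\n\tmsr daifclr, #4\n\tisb"
				     : : : "memory");
		}
	}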
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v4:
>  * Moved the SError handling into handle_exit_early()
>  * Dropped Marc & Christoffer's Reviewed-by due to handle_exit_early().
> 

I realize this is queued, but for good measure, I'm still happy with
this change after handle_exit_early().

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-19 19:20     ` Christoffer Dall
@ 2018-01-22 18:18       ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-22 18:18 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm, linux-arm-kernel

Hi Christoffer,

On 19/01/18 19:20, Christoffer Dall wrote:
> On Mon, Jan 15, 2018 at 07:39:04PM +0000, James Morse wrote:
>> We expect to have firmware-first handling of RAS SErrors, with errors
>> notified via an APEI method. For systems without firmware-first, add
>> some minimal handling to KVM.
>>
>> There are two ways KVM can take an SError due to a guest, and either may be a
>> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
>> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
>>
>> For SError that interrupt a guest and are routed to EL2 the existing
>> behaviour is to inject an impdef SError into the guest.
>>
>> Add code to handle RAS SError based on the ESR. For uncontained and
>> uncategorized errors arm64_is_fatal_ras_serror() will panic(); these
>> errors compromise the host too. All other error types are contained:
>> For the fatal errors the vCPU can't make progress, so we inject a virtual
>> SError. We ignore contained errors where we can make progress; if we're
>> lucky, we may not hit them again.
>>
>> If only some of the CPUs support RAS the guest will see the cpufeature
>> sanitised version of the id registers, but we may still take RAS SError
>> on this CPU. Move the SError handling out of handle_exit() into a new
>> handler that runs before we can be preempted. This allows us to use
>> this_cpu_has_cap(), via arm64_is_ras_serror().
> 
> Would it be possible to optimize this a bit later on by caching
> this_cpu_has_cap() in vcpu_load() so that we can use a single
> handle_exit function to process all exits?

If vcpu_load() prevents pre-emption between the guest-exit exception and the
this_cpu_has_cap() test then we wouldn't need a separate handle_exit().

But, if we support kernel-first RAS or firmware-first's NOTIFY_SEI we shouldn't
unmask SError until we've fed the guest-exit:SError into the RAS code. This
would also need the SError related handle_exit() calls to be separate/earlier.
(there was some verbiage on this in the cover letter).

(I started down the 'make handle_exit() non-preemptible' path, but WF{E,I}'s
kvm_vcpu_block()->schedule() and kvm_vcpu_on_spin()'s use of kvm_vcpu_yield_to()
put an end to that).


In terms of caching the this_cpu_has_cap() value, is this due to a performance
concern? It's all called behind 'exception_index == ARM_EXCEPTION_EL1_SERROR',
so we've already taken an SError out of the guest. Once it's all put together
we're likely to have a pending signal for user-space.
'Corrected' (or at least ignorable) errors are going to be the odd one out; I
don't think we should worry about these!


Thanks,

James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions
  2018-01-16 11:17   ` gengdongjiu
@ 2018-01-22 19:32       ` James Morse
  0 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-22 19:32 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm

Hi gengdongjiu,

On 16/01/18 11:17, gengdongjiu wrote:
> Hi James,
> 
> On 2018/1/16 3:38, James Morse wrote:
>> From: Xie XiuQi <xiexiuqi@huawei.com>
>>
>> ARM's v8.2 Extensions add support for Reliability, Availability and
>> Serviceability (RAS). On CPUs with these extensions system software
>> can use additional barriers to isolate errors and determine if faults
>> are pending. Add cpufeature detection.
>>
>> Platform level RAS support may require additional firmware support.

>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 664fadc2aa2e..1d51c8edf34b 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1062,6 +1062,22 @@ config ARM64_PMEM
>>  	  operations if DC CVAP is not supported (following the behaviour of
>>  	  DC CVAP itself if the system does not define a point of persistence).
>>  
>> +config ARM64_RAS_EXTN
>> +	bool "Enable support for RAS CPU Extensions"
>> +	default y
>> +	help
>> +	  CPUs that support the Reliability, Availability and Serviceability
>> +	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
>> +	  errors, classify them and report them to software.
>> +
>> +	  On CPUs with these extensions system software can use additional
>> +	  barriers to determine if faults are pending and read the
>> +	  classification from a new set of registers.
>> +
>> +	  Selecting this feature will allow the kernel to use these barriers
>> +	  and access the new registers if the system supports the extension.
>> +	  Platform RAS features may additionally depend on firmware support.
>> +
>>  endmenu
>>  
> it seems this "CONFIG_ARM64_RAS_EXTN" is not enabled in "arch/arm64/configs/defconfig";
> if not, I want to enable this config in the defconfig to turn on the RAS feature. Do you agree?

Sure. This series doesn't do a lot on its own; it expects firmware-first or
kernel-first support, which may in turn depend on this feature. It means we
don't panic() when notified of corrected errors, until we get the
{firmware,kernel}-first support.

Don't defconfig changes get collected by arm-soc? (I'm not sure how these get
picked up...)


James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions
  2018-01-22 19:32       ` James Morse
@ 2018-01-23  9:06         ` gengdongjiu
  -1 siblings, 0 replies; 60+ messages in thread
From: gengdongjiu @ 2018-01-23  9:06 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm

Hi James,

On 2018/1/23 3:32, James Morse wrote:
>> it seems this "CONFIG_ARM64_RAS_EXTN" is not enabled in "arch/arm64/configs/defconfig";
>> if not, I want to enable this config in the defconfig to turn on the RAS feature. Do you agree?
> Sure. This series doesn't do a lot on its own; it expects firmware-first or
> kernel-first support, which may in turn depend on this feature. It means we
> don't panic() when notified of corrected errors, until we get the
> {firmware,kernel}-first support.
> 
> Don't defconfig changes get collected by arm-soc? (I'm not sure how these get
> picked up...)

Now that we already support firmware-first, do you mean we should not enable "CONFIG_ARM64_RAS_EXTN" in the
defconfig for ARM SoCs until kernel-first RAS is supported?

> 
> 
> James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-22 18:18       ` James Morse
@ 2018-01-23 15:32         ` Christoffer Dall
  -1 siblings, 0 replies; 60+ messages in thread
From: Christoffer Dall @ 2018-01-23 15:32 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	Dongjiu Geng, kvmarm, linux-arm-kernel

On Mon, Jan 22, 2018 at 06:18:54PM +0000, James Morse wrote:
> Hi Christoffer,
> 
> On 19/01/18 19:20, Christoffer Dall wrote:
> > On Mon, Jan 15, 2018 at 07:39:04PM +0000, James Morse wrote:
> >> We expect to have firmware-first handling of RAS SErrors, with errors
> >> notified via an APEI method. For systems without firmware-first, add
> >> some minimal handling to KVM.
> >>
> >> There are two ways KVM can take an SError due to a guest, and either may be a
> >> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
> >> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
> >>
> >> For SError that interrupt a guest and are routed to EL2 the existing
> >> behaviour is to inject an impdef SError into the guest.
> >>
> >> Add code to handle RAS SError based on the ESR. For uncontained and
> >> uncategorized errors arm64_is_fatal_ras_serror() will panic(); these
> >> errors compromise the host too. All other error types are contained:
> >> For the fatal errors the vCPU can't make progress, so we inject a virtual
> >> SError. We ignore contained errors where we can make progress; if we're
> >> lucky, we may not hit them again.
> >>
> >> If only some of the CPUs support RAS the guest will see the cpufeature
> >> sanitised version of the id registers, but we may still take RAS SError
> >> on this CPU. Move the SError handling out of handle_exit() into a new
> >> handler that runs before we can be preempted. This allows us to use
> >> this_cpu_has_cap(), via arm64_is_ras_serror().
> > 
> > Would it be possible to optimize this a bit later on by caching
> > this_cpu_has_cap() in vcpu_load() so that we can use a single
> > handle_exit function to process all exits?
> 
> If vcpu_load() prevents pre-emption between the guest-exit exception and the
> this_cpu_has_cap() test then we wouldn't need a separate handle_exit().

It doesn't, but you'd get another vcpu_put() / vcpu_load() if you get
preempted, and you could record anything you need to know about the CPU
that actually ran the guest in vcpu_put().

So it might be possible to call some "process pending serror" function
in vcpu_put().
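
For illustration only (serror_pending, serror_esr and the helper name are
all invented here; this isn't a concrete proposal):

	/* Called from vcpu_put(), i.e. still on the physical CPU that
	 * last ran the guest: a natural place to classify and consume
	 * any SError we exited the guest with. */
	static void kvm_process_pending_serror(struct kvm_vcpu *vcpu)
	{
		if (vcpu->arch.serror_pending)
			kvm_handle_guest_serror(vcpu, vcpu->arch.serror_esr);
	}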

> 
> But, if we support kernel-first RAS or firmware-first's NOTIFY_SEI we shouldn't
> unmask SError until we've fed the guest-exit:SError into the RAS code. This
> would also need the SError related handle_exit() calls to be separate/earlier.
> (there was some verbiage on this in the cover letter).

Yeah, I sort-of understood where this was going...

> 
> (I started down the 'make handle_exit() non-preemptible' path, but WF{E,I}'s
> kvm_vcpu_block()->schedule() and kvm_vcpu_on_spin()'s use of kvm_vcpu_yield_to()
> put an end to that).

It's not clear to me exactly how that would work, as handle_exit() can
also block on stuff like allocating memory.  I suppose enabling
preemption could be per exit reason, but that might be hard to maintain.

> 
> 
> In terms of caching the this_cpu_has_cap() value, is this due to a performance
> concern? It's all called behind 'exception_index == ARM_EXCEPTION_EL1_SERROR',
> so we've already taken an SError out of the guest. Once it's all put together
> we're likely to have a pending signal for user-space.
> 'Corrected' (or at least ignorable) errors are going to be the odd one out; I
> don't think we should worry about these!

The performance concern is having to call another function to check the
return value again in the critical path.  On older implementations this
kind of thing is actually measurable, and there's a tendency to add a
call here and a call there for any new aspect of the architecture, and
it will eventually weigh things down, I believe.  On the other hand,
having a "process some things before we enable preemption" which is your
handle_exit_early() function (could this also have been called
handle_exit_nopreempt() ?) is a potentially generally useful thing to
have and a reasonable thing overall.

Anyway, I was just trying to spitball a bit on the topic, no immediate
change required.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions
  2018-01-23  9:06         ` gengdongjiu
@ 2018-01-23 19:05           ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-23 19:05 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm

Hi gengdongjiu,

On 23/01/18 09:06, gengdongjiu wrote:
> On 2018/1/23 3:32, James Morse wrote:
>>> it seems this "CONFIG_ARM64_RAS_EXTN" is not enabled in "arch/arm64/configs/defconfig";
>>> if not, I want to enable this config in the defconfig to turn on the RAS feature. Do you agree?
>> Sure. This series doesn't do a lot on its own; it expects firmware-first or
>> kernel-first support, which may in turn depend on this feature. It means we
>> don't panic() when notified of corrected errors, until we get the
>> {firmware,kernel}-first support.
>>
>> Don't defconfig changes get collected by arm-soc? (I'm not sure how these get
>> picked up...)
> 
> Now that we already support firmware-first,

For NOTIFY_SEI? We don't have that yet.
This series was about the minimal handling for systems with neither firmware-first
nor kernel-first handling. This stops us panic()ing on corrected errors.
It also enables IESB, which benefits firmware-first handling using the
notification types we already support (SEA, POLL, IRQ, etc.).

From here we can add KVM APIs, firmware-first notification support and
kernel-first support as independent series.


> do you mean we should not enable "CONFIG_ARM64_RAS_EXTN" in the
> defconfig for ARM SoCs until kernel-first RAS is supported?

I've no idea if or when we will do kernel-first. When I bring it up, it's so we
don't build a hybrid model, and so we consider how we're going to add kernel-first
support if/when it comes along.

If you want it turned on in defconfig please submit a patch to do that. I
haven't because I don't know where they go!


Thanks,

James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions
  2018-01-23 19:05           ` James Morse
@ 2018-01-25  8:27             ` gengdongjiu
  -1 siblings, 0 replies; 60+ messages in thread
From: gengdongjiu @ 2018-01-25  8:27 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm

On 2018/1/24 3:05, James Morse wrote:
>> do you mean we should not enable "CONFIG_ARM64_RAS_EXTN" in the
>> defconfig for ARM SoCs until kernel-first RAS is supported?
> I've no idea if or when we will do kernel-first. When I bring it up, it's so we
> don't build a hybrid model, and so we consider how we're going to add kernel-first
> support if/when it comes along.
> 
> If you want it turned on in defconfig please submit a patch to do that. I
Sorry, it is my mistake.
In your patch, ARM64_RAS_EXTN is already turned on by default, so there is no need to turn it on again.

+config ARM64_RAS_EXTN
+       bool "Enable support for RAS CPU Extensions"
+       default y
+       help
+         CPUs that support the Reliability, Availability and Serviceability


> haven't because I don't know where they go!

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  2018-01-23 15:32         ` Christoffer Dall
@ 2018-01-30 19:18           ` James Morse
  -1 siblings, 0 replies; 60+ messages in thread
From: James Morse @ 2018-01-30 19:18 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Jonathan.Zhang, Xie XiuQi, Marc Zyngier, Catalin Marinas,
	Suzuki K Poulose, Will Deacon, Dongjiu Geng, kvmarm,
	linux-arm-kernel

Hi Christoffer,

On 23/01/18 15:32, Christoffer Dall wrote:
> On Mon, Jan 22, 2018 at 06:18:54PM +0000, James Morse wrote:
>> On 19/01/18 19:20, Christoffer Dall wrote:
>>> On Mon, Jan 15, 2018 at 07:39:04PM +0000, James Morse wrote:
>>>> If only some of the CPUs support RAS the guest will see the cpufeature
>>>> sanitised version of the id registers, but we may still take RAS SError
>>>> on this CPU. Move the SError handling out of handle_exit() into a new
>>>> handler that runs before we can be preempted. This allows us to use
>>>> this_cpu_has_cap(), via arm64_is_ras_serror().
>>>
>>> Would it be possible to optimize this a bit later on by caching
>>> this_cpu_has_cap() in vcpu_load() so that we can use a single
>>> handle_exit function to process all exits?
>>
>> If vcpu_load() prevents pre-emption between the guest-exit exception and the
>> this_cpu_has_cap() test then we wouldn't need a separate handle_exit().
> 
> It doesn't, but you'd get another vcpu_put() / vcpu_load() if you get
> preempted, and you could record anything you need to know about the CPU
> that actually ran the guest in vcpu_put().

Snazzy!

> So it might be possible to call some "process pending serror" function
> in vcpu_put().

Hmm, maybe. When we exit the guest it's because we've had a notification that an
error occurred, but we don't know what/where yet. The case that worries me is where
we reschedule() onto some other affected task, and it gets notification of the
error too.

For notifications signalled by an SError I'd like to feed them into the RAS
machinery before we unmask SError on the host, so that the first error is
handled first. Otherwise KVM has to eyeball the SError ESR and guess as to
whether the host is affected by the error, before re-enabling preemption on the
grounds it 'probably only affects this guest'.


>> But, if we support kernel-first RAS or firmware-first's NOTIFY_SEI we shouldn't
>> unmask SError until we've fed the guest-exit:SError into the RAS code. This
>> would also need the SError related handle_exit() calls to be separate/earlier.
>> (there was some verbiage on this in the cover letter).
> 
> Yeah, I sort-of understood where this was going...

(sorry, I assume not everyone reads the cover letter!)


>> (I started down the 'make handle_exit() non-preemptible' path, but WF{E,I}'s
>> kvm_vcpu_block()->schedule() and kvm_vcpu_on_spin()'s use of kvm_vcpu_yield_to()
>> put an end to that).
> 
> It's not clear to me exactly how that would work, as handle_exit() can
> also block on stuff like allocating memory.  

Yes, it was a dead end. I figured two handle_exit()s was a bit ugly; I assumed
you were asking about moving back to a single handle_exit()...


> I suppose enabling
> preemption could be per exit reason, but that might be hard to maintain.


>> In terms of caching the this_cpu_has_cap() value, is this due to a performance
>> concern? It's all called behind 'exception_index == ARM_EXCEPTION_EL1_SERROR',
>> so we've already taken an SError out of the guest. Once it's all put together
>> we're likely to have a pending signal for user-space.
>> 'Corrected' (or at least ignorable) errors are going to be the odd one out; I
>> don't think we should worry about these!
> 
> The performance concern is having to call another function to check the
> return value again in the critical path.

My justification for this sort of thing has been that we've taken an SError and may
panic() the host if it's uncontained. Provided there is no extra cost on the 'no
SError' path, I don't think the 'we've taken an SError' paths need to be fast.


> On older implementations this
> kind of thing is actually measurable, and there's a tendency to add a
> call here and a call there for any new aspect of the architecture, and
> it will eventually weigh things down, I believe.

I'll keep this in mind.


> On the other hand, having a "process some things before we enable preemption"
> step, which is your handle_exit_early() function (could this also have been called
> handle_exit_nopreempt()?),

Yes, and that would have been a better name!


Thanks,

James


> is a potentially generally useful thing to
> have and a reasonable thing overall.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions
@ 2018-01-24 13:53 gengdongjiu
  0 siblings, 0 replies; 60+ messages in thread
From: gengdongjiu @ 2018-01-24 13:53 UTC (permalink / raw)
  To: James Morse
  Cc: Jonathan.Zhang, Marc Zyngier, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm

Hi James,
   Thanks for this mail.

> Hi gengdongjiu,
> 
> On 23/01/18 09:06, gengdongjiu wrote:
> > On 2018/1/23 3:32, James Morse wrote:
> >>> it seems this "CONFIG_ARM64_RAS_EXTN" is not enabled in
> >>> "arch/arm64/configs/defconfig"; if not, I want to enable this config in the defconfig to turn on the RAS feature. Do you agree?
> >> Sure. This series doesn't do a lot on its own; it expects
> >> firmware-first or kernel-first support, which may in turn depend on
> >> this feature. It means we don't panic() when notified of corrected
> >> errors, until we get the {firmware,kernel}-first support.
> >>
> >> Don't defconfig changes get collected by arm-soc? (I'm not sure how
> >> these get picked up...)
> >
> > Now that we already support firmware-first,
> 
> For NOTIFY_SEI? We don't have that yet.

For NOTIFY_SEI, not yet.
I mean NOTIFY_SEA/NOTIFY_GSIV/NOTIFY_GPIO with firmware-first, which have been enabled before.

> This series was about the minimal handling for systems with neither firmware-first nor kernel-first handling. This stops us panic()ing on corrected
> errors.
> It also enables IESB, which benefits firmware-first handling using the notification types we already support (SEA, POLL, IRQ, etc.).
> 
> From here we can add KVM APIs, firmware-first notification support and kernel-first support as independent series.

Yes, I agree with that. I added a KVM API which can be an independent series.

> 
> 
> > do you mean we should not enable "CONFIG_ARM64_RAS_EXTN" in the defconfig
> > for ARM SoCs until kernel-first RAS is supported?
> 
> I've no idea if or when we will do kernel-first. When I bring it up, it's so we don't build a hybrid model, and so we consider how we're going to add
> kernel-first support if/when it comes along.

Understood. Frankly speaking, I think we will mainly use the firmware-first solution for ARM SoCs.

> 
> If you want it turned on in defconfig please submit a patch to do that. I haven't because I don't know where they go!

OK, tomorrow I will submit a patch to turn it on so that it can benefit firmware-first handling; the KVM API also depends on this configuration.
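
For reference, I expect the whole change to be a single line in
arch/arm64/configs/defconfig (illustrative here, not the actual patch):

	CONFIG_ARM64_RAS_EXTN=y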

> 
> 
> Thanks,
> 
> James

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread [newest: ~2018-01-30 19:18 UTC]

Thread overview: 60+ messages
2018-01-15 19:38 [PATCH v6 00/13] arm64/KVM: RAS & IESB for firmware first support James Morse
2018-01-15 19:38 ` James Morse
2018-01-15 19:38 ` [PATCH v6 01/13] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early James Morse
2018-01-15 19:38   ` James Morse
2018-01-16  9:51   ` Marc Zyngier
2018-01-16 15:04   ` Catalin Marinas
2018-01-16 15:09     ` Suzuki K Poulose
2018-01-15 19:38 ` [PATCH v6 02/13] arm64: sysreg: Move to use definitions for all the SCTLR bits James Morse
2018-01-15 19:38   ` James Morse
2018-01-15 19:38 ` [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions James Morse
2018-01-15 19:38   ` James Morse
2018-01-16 10:26   ` Suzuki K Poulose
2018-01-16 11:17   ` gengdongjiu
2018-01-22 19:32     ` James Morse
2018-01-22 19:32       ` James Morse
2018-01-23  9:06       ` gengdongjiu
2018-01-23  9:06         ` gengdongjiu
2018-01-23 19:05         ` James Morse
2018-01-23 19:05           ` James Morse
2018-01-25  8:27           ` gengdongjiu
2018-01-25  8:27             ` gengdongjiu
2018-01-15 19:38 ` [PATCH v6 04/13] arm64: kernel: Survive corrected RAS errors notified by SError James Morse
2018-01-15 19:38   ` James Morse
2018-01-15 19:38 ` [PATCH v6 05/13] arm64: Unconditionally enable IESB on exception entry/return for firmware-first James Morse
2018-01-15 19:38   ` James Morse
2018-01-16  9:55   ` Marc Zyngier
2018-01-15 19:38 ` [PATCH v6 06/13] arm64: kernel: Prepare for a DISR user James Morse
2018-01-15 19:38   ` James Morse
2018-01-16 11:11   ` Suzuki K Poulose
2018-01-15 19:39 ` [PATCH v6 07/13] KVM: arm/arm64: mask/unmask daif around VHE guests James Morse
2018-01-15 19:39   ` James Morse
2018-01-16 10:01   ` Marc Zyngier
2018-01-15 19:39 ` [PATCH v6 08/13] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
2018-01-15 19:39   ` James Morse
2018-01-16 10:05   ` Marc Zyngier
2018-01-15 19:39 ` [PATCH v6 09/13] KVM: arm64: Save/Restore guest DISR_EL1 James Morse
2018-01-15 19:39   ` James Morse
2018-01-15 19:39 ` [PATCH v6 10/13] KVM: arm64: Save ESR_EL2 on guest SError James Morse
2018-01-15 19:39   ` James Morse
2018-01-16  9:41   ` Marc Zyngier
2018-01-15 19:39 ` [PATCH v6 11/13] KVM: arm64: Handle RAS SErrors from EL1 on guest exit James Morse
2018-01-15 19:39   ` James Morse
2018-01-16  9:29   ` Marc Zyngier
2018-01-19 19:20   ` Christoffer Dall
2018-01-19 19:20     ` Christoffer Dall
2018-01-22 18:18     ` James Morse
2018-01-22 18:18       ` James Morse
2018-01-23 15:32       ` Christoffer Dall
2018-01-23 15:32         ` Christoffer Dall
2018-01-30 19:18         ` James Morse
2018-01-30 19:18           ` James Morse
2018-01-15 19:39 ` [PATCH v6 12/13] KVM: arm64: Handle RAS SErrors from EL2 " James Morse
2018-01-15 19:39   ` James Morse
2018-01-16  9:36   ` Marc Zyngier
2018-01-19 19:54   ` Christoffer Dall
2018-01-19 19:54     ` Christoffer Dall
2018-01-15 19:39 ` [PATCH v6 13/13] KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA James Morse
2018-01-15 19:39   ` James Morse
2018-01-16 17:36 ` [PATCH v6 00/13] arm64/KVM: RAS & IESB for firmware first support Catalin Marinas
2018-01-24 13:53 [PATCH v6 03/13] arm64: cpufeature: Detect CPU RAS Extensions gengdongjiu
