linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 0/9] arm64: KVM: debug infrastructure support
@ 2014-06-20 12:59 Marc Zyngier
  2014-06-20 12:59 ` [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi Marc Zyngier
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

This patch series adds debug support, a key feature missing from the
KVM/arm64 port.

The main idea is to keep track of whether the debug registers are
"dirty" (changed by the guest) or not. If they are, perform the usual
save/restore dance, for one run only. This means we only pay a penalty
when a guest is actively using the debug registers.
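The dirty-tracking idea above can be sketched as follows. This is a standalone simplification, not the actual hypervisor code: the struct layout, the `trap_debug_write`/`world_switch` names and the `traps_enabled` field are illustrative, with only the `KVM_ARM64_DEBUG_DIRTY` flag name taken from the series itself.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_ARM64_DEBUG_DIRTY	(1 << 0)

struct vcpu {
	uint64_t debug_flags;
	int traps_enabled;	/* debug-register trapping on/off */
};

/* A guest write to a debug register traps; mark the state dirty. */
static void trap_debug_write(struct vcpu *v)
{
	v->debug_flags |= KVM_ARM64_DEBUG_DIRTY;
}

/*
 * World switch: only do the full debug-register save/restore dance
 * when the guest has actually dirtied the state; otherwise take the
 * fast path. Returns 1 if a full switch was performed, 0 otherwise.
 */
static int world_switch(struct vcpu *v)
{
	if (v->debug_flags & KVM_ARM64_DEBUG_DIRTY) {
		/* ... save host regs, restore guest regs ... */
		v->traps_enabled = 0;	/* stop trapping for this run */
		v->debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
		return 1;
	}
	return 0;
}
```

A clean guest never triggers the save/restore path; a guest that touches the debug registers pays for it once, then drops back to the fast path.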

The number of registers is properly frightening, but CPUs actually
only implement a subset of them. Also, there are a number of registers
we don't bother emulating (things having to do with external debug and
OSlock).

External debug is when you actually plug a physical JTAG debugger into
the CPU. OSlock is a way to prevent "other software" from playing with
the debug registers. My understanding is that it is only useful in
combination with external debug. In both cases, implementing support
for this is probably not worth the effort, at least for the time being.

This has been tested on a Cortex-A53/A57 platform, running both 32 and
64bit guests, on top of 3.16-rc1. This code also lives in my tree in
the kvm-arm64/debug-trap branch.

From v2 [2]:
- Fixed a number of very stupid bugs in the macros generating the trap
  entries
- Added some documentation explaining why we don't bother emulating
  external debug and the OSlock stuff
- Other bits of documentation here and there

From v1 [1]:
- Renamed trap_wi_raz to trap_raz_wi
- Renamed skip_clean_debug_state to skip_debug_state
- Simplified debug state computing, moved to its own macro
- Added some comments to make the logic more obvious

[1]: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009332.html
[2]: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009534.html

Marc Zyngier (9):
  arm64: KVM: rename pm_fake handler to trap_raz_wi
  arm64: move DBG_MDSCR_* to asm/debug-monitors.h
  arm64: KVM: add trap handlers for AArch64 debug registers
  arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
  arm64: KVM: use separate tables for AArch32 32 and 64bit traps
  arm64: KVM: check ordering of all system register tables
  arm64: KVM: add trap handlers for AArch32 debug registers
  arm64: KVM: implement lazy world switch for debug registers
  arm64: KVM: enable trapping of all debug registers

 arch/arm64/include/asm/debug-monitors.h |  19 +-
 arch/arm64/include/asm/kvm_asm.h        |  39 ++-
 arch/arm64/include/asm/kvm_coproc.h     |   3 +-
 arch/arm64/include/asm/kvm_host.h       |  12 +-
 arch/arm64/kernel/asm-offsets.c         |   1 +
 arch/arm64/kernel/debug-monitors.c      |   9 -
 arch/arm64/kvm/handle_exit.c            |   4 +-
 arch/arm64/kvm/hyp.S                    | 470 ++++++++++++++++++++++++++++-
 arch/arm64/kvm/sys_regs.c               | 520 ++++++++++++++++++++++++++++----
 9 files changed, 978 insertions(+), 99 deletions(-)

-- 
1.8.3.4

* [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
@ 2014-06-20 12:59 ` Marc Zyngier
  2014-07-09  9:27   ` Christoffer Dall
  2014-06-20 13:00 ` [PATCH v3 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h Marc Zyngier
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).

As we're about to use it (a lot) in a different context, rename it
with an (admittedly cryptic) name that makes sense for all users.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 83 ++++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c59a1bd..4abd84e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -163,18 +163,9 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
 	return true;
 }
 
-/*
- * We could trap ID_DFR0 and tell the guest we don't support performance
- * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
- * NAKed, so it will read the PMCR anyway.
- *
- * Therefore we tell the guest we have 0 counters.  Unfortunately, we
- * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
- * all PM registers, which doesn't crash the guest kernel at least.
- */
-static bool pm_fake(struct kvm_vcpu *vcpu,
-		    const struct sys_reg_params *p,
-		    const struct sys_reg_desc *r)
+static bool trap_raz_wi(struct kvm_vcpu *vcpu,
+			const struct sys_reg_params *p,
+			const struct sys_reg_desc *r)
 {
 	if (p->is_write)
 		return ignore_write(vcpu, p);
@@ -201,6 +192,17 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
+ *
+ * We could trap ID_DFR0 and tell the guest we don't support performance
+ * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
+ * NAKed, so it will read the PMCR anyway.
+ *
+ * Therefore we tell the guest we have 0 counters.  Unfortunately, we
+ * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
+ * all PM registers, which doesn't crash the guest kernel at least.
+ *
+ * Same goes for the whole debug infrastructure, which probably breaks
+ * some guest functionnality. This should be fixed.
  */
 static const struct sys_reg_desc sys_reg_descs[] = {
 	/* DC ISW */
@@ -260,10 +262,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	/* PMINTENSET_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMINTENCLR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b010),
-	  pm_fake },
+	  trap_raz_wi },
 
 	/* MAIR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0010), Op2(0b000),
@@ -292,43 +294,43 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	/* PMCR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b000),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMCNTENSET_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b001),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMCNTENCLR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b010),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMOVSCLR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b011),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMSWINC_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b100),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMSELR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b101),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMCEID0_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b110),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMCEID1_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b111),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMCCNTR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b000),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMXEVTYPER_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b001),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMXEVCNTR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b010),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMUSERENR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1110), Op2(0b000),
-	  pm_fake },
+	  trap_raz_wi },
 	/* PMOVSSET_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1110), Op2(0b011),
-	  pm_fake },
+	  trap_raz_wi },
 
 	/* TPIDR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b0000), Op2(0b010),
@@ -374,19 +376,20 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn( 7), CRm(10), Op2( 2), access_dcsw },
 	{ Op1( 0), CRn( 7), CRm(14), Op2( 2), access_dcsw },
 
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 2), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 3), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 5), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 6), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 7), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 2), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 2), pm_fake },
+	/* PMU */
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 0), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 1), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 2), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 3), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 5), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 6), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 7), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 0), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 1), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 2), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 0), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 1), trap_raz_wi },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 2), trap_raz_wi },
 
 	{ Op1( 0), CRn(10), CRm( 2), Op2( 0), access_vm_reg, NULL, c10_PRRR },
 	{ Op1( 0), CRn(10), CRm( 2), Op2( 1), access_vm_reg, NULL, c10_NMRR },
-- 
1.8.3.4

* [PATCH v3 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
  2014-06-20 12:59 ` [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers Marc Zyngier
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

In order to be able to use the DBG_MDSCR_* macros from the KVM code,
move the relevant definitions to the obvious include file.

Also move the debug_el enum to a portion of the file that is guarded
by #ifndef __ASSEMBLY__, so that the header can be included from
assembly code.
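For reference, the constants being moved expand as below (a minimal standalone check using the values exactly as defined in the patch; only the `main`-less test harness around it is synthetic). The enum has to sit under the #ifndef __ASSEMBLY__ guard because the assembler can consume #define constants via the preprocessor, but a C enum means nothing to it.

```c
#include <assert.h>

/* Low-level stepping controls (values from the patch). */
#define DBG_MDSCR_SS	(1 << 0)
#define DBG_SPSR_SS	(1 << 21)

/* MDSCR_EL1 enabling bits. */
#define DBG_MDSCR_KDE	(1 << 13)
#define DBG_MDSCR_MDE	(1 << 15)
#define DBG_MDSCR_MASK	~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)

/*
 * In the real header this enum lives under #ifndef __ASSEMBLY__:
 * it is C-only, unlike the #defines above.
 */
enum debug_el {
	DBG_ACTIVE_EL0 = 0,
	DBG_ACTIVE_EL1,
};
```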

Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/debug-monitors.h | 19 ++++++++++++++-----
 arch/arm64/kernel/debug-monitors.c      |  9 ---------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 6e9b5b3..7fb3437 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -18,6 +18,15 @@
 
 #ifdef __KERNEL__
 
+/* Low-level stepping controls. */
+#define DBG_MDSCR_SS		(1 << 0)
+#define DBG_SPSR_SS		(1 << 21)
+
+/* MDSCR_EL1 enabling bits */
+#define DBG_MDSCR_KDE		(1 << 13)
+#define DBG_MDSCR_MDE		(1 << 15)
+#define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
+
 #define	DBG_ESR_EVT(x)		(((x) >> 27) & 0x7)
 
 /* AArch64 */
@@ -73,11 +82,6 @@
 
 #define CACHE_FLUSH_IS_SAFE		1
 
-enum debug_el {
-	DBG_ACTIVE_EL0 = 0,
-	DBG_ACTIVE_EL1,
-};
-
 /* AArch32 */
 #define DBG_ESR_EVT_BKPT	0x4
 #define DBG_ESR_EVT_VECC	0x5
@@ -115,6 +119,11 @@ void unregister_break_hook(struct break_hook *hook);
 
 u8 debug_monitors_arch(void);
 
+enum debug_el {
+	DBG_ACTIVE_EL0 = 0,
+	DBG_ACTIVE_EL1,
+};
+
 void enable_debug_monitors(enum debug_el el);
 void disable_debug_monitors(enum debug_el el);
 
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index a7fb874..e022f87 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -30,15 +30,6 @@
 #include <asm/cputype.h>
 #include <asm/system_misc.h>
 
-/* Low-level stepping controls. */
-#define DBG_MDSCR_SS		(1 << 0)
-#define DBG_SPSR_SS		(1 << 21)
-
-/* MDSCR_EL1 enabling bits */
-#define DBG_MDSCR_KDE		(1 << 13)
-#define DBG_MDSCR_MDE		(1 << 15)
-#define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
-
 /* Determine debug architecture. */
 u8 debug_monitors_arch(void)
 {
-- 
1.8.3.4

* [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
  2014-06-20 12:59 ` [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-07-09  9:38   ` Christoffer Dall
  2014-06-20 13:00 ` [PATCH v3 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15 Marc Zyngier
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Add handlers for all the AArch64 debug registers that are accessible
from EL0 or EL1. The trapping code keeps track of the state of the
debug registers, allowing for the switch code to implement a lazy
switching strategy.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
 arch/arm64/include/asm/kvm_host.h |   3 +
 arch/arm64/kvm/sys_regs.c         | 137 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 159 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 9fcd54b..e6b159a 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -43,14 +43,25 @@
 #define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
 #define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
 #define	PAR_EL1		21	/* Physical Address Register */
+#define MDSCR_EL1	22	/* Monitor Debug System Control Register */
+#define DBGBCR0_EL1	23	/* Debug Breakpoint Control Registers (0-15) */
+#define DBGBCR15_EL1	38
+#define DBGBVR0_EL1	39	/* Debug Breakpoint Value Registers (0-15) */
+#define DBGBVR15_EL1	54
+#define DBGWCR0_EL1	55	/* Debug Watchpoint Control Registers (0-15) */
+#define DBGWCR15_EL1	70
+#define DBGWVR0_EL1	71	/* Debug Watchpoint Value Registers (0-15) */
+#define DBGWVR15_EL1	86
+#define MDCCINT_EL1	87	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+
 /* 32bit specific registers. Keep them at the end of the range */
-#define	DACR32_EL2	22	/* Domain Access Control Register */
-#define	IFSR32_EL2	23	/* Instruction Fault Status Register */
-#define	FPEXC32_EL2	24	/* Floating-Point Exception Control Register */
-#define	DBGVCR32_EL2	25	/* Debug Vector Catch Register */
-#define	TEECR32_EL1	26	/* ThumbEE Configuration Register */
-#define	TEEHBR32_EL1	27	/* ThumbEE Handler Base Register */
-#define	NR_SYS_REGS	28
+#define	DACR32_EL2	88	/* Domain Access Control Register */
+#define	IFSR32_EL2	89	/* Instruction Fault Status Register */
+#define	FPEXC32_EL2	90	/* Floating-Point Exception Control Register */
+#define	DBGVCR32_EL2	91	/* Debug Vector Catch Register */
+#define	TEECR32_EL1	92	/* ThumbEE Configuration Register */
+#define	TEEHBR32_EL1	93	/* ThumbEE Handler Base Register */
+#define	NR_SYS_REGS	94
 
 /* 32bit mapping */
 #define c0_MPIDR	(MPIDR_EL1 * 2)	/* MultiProcessor ID Register */
@@ -87,6 +98,9 @@
 #define ARM_EXCEPTION_IRQ	  0
 #define ARM_EXCEPTION_TRAP	  1
 
+#define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
+#define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 92242ce..79573c86 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -101,6 +101,9 @@ struct kvm_vcpu_arch {
 	/* Exception Information */
 	struct kvm_vcpu_fault_info fault;
 
+	/* Debug state */
+	u64 debug_flags;
+
 	/* Pointer to host CPU context */
 	kvm_cpu_context_t *host_cpu_context;
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 4abd84e..808e3b2 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -30,6 +30,7 @@
 #include <asm/kvm_mmu.h>
 #include <asm/cacheflush.h>
 #include <asm/cputype.h>
+#include <asm/debug-monitors.h>
 #include <trace/events/kvm.h>
 
 #include "sys_regs.h"
@@ -173,6 +174,60 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 		return read_zero(vcpu, p);
 }
 
+static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
+			   const struct sys_reg_params *p,
+			   const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = (1 << 3);
+		return true;
+	}
+}
+
+static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
+				   const struct sys_reg_params *p,
+				   const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		u32 val;
+		asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
+		*vcpu_reg(vcpu, p->Rt) = val;
+		return true;
+	}
+}
+
+/*
+ * We want to avoid world-switching all the DBG registers all the
+ * time. For this, we use a DIRTY but, indicating the guest has
+ * modified the debug registers, and only restore the registers once,
+ * disabling traps.
+ *
+ * The best thing to do would be to trap MDSCR_EL1 independently, test
+ * if DBG_MDSCR_KDE or DBG_MDSCR_MDE is getting set, and only set the
+ * DIRTY bit in that case.
+ *
+ * Unfortunately, "old" Linux kernels tend to hit MDSCR_EL1 like a
+ * woodpecker on a tree, and it is better to disable trapping as soon
+ * as possible in this case. Some day, make this a tuneable...
+ */
+static bool trap_debug_regs(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_params *p,
+			    const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+	}
+
+	return true;
+}
+
 static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
 	u64 amair;
@@ -189,6 +244,21 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 	vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
 }
 
+/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
+#define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
+	/* DBGBVRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100),	\
+	  trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 },		\
+	/* DBGBCRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101),	\
+	  trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 },		\
+	/* DBGWVRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110),	\
+	  trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 },		\
+	/* DBGWCRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111),	\
+	  trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 }
+
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
@@ -201,8 +271,12 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
  * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
  * all PM registers, which doesn't crash the guest kernel at least.
  *
- * Same goes for the whole debug infrastructure, which probably breaks
- * some guest functionnality. This should be fixed.
+ * Debug handling: We do trap most, if not all debug related system
+ * registers. The implementation is good enough to ensure that a guest
+ * can use these with minimal performance degradation. The drawback is
+ * that we don't implement any of the external debug, none of the
+ * OSlock protocol. This should be revisited if we ever encounter a
+ * more demanding guest...
  */
 static const struct sys_reg_desc sys_reg_descs[] = {
 	/* DC ISW */
@@ -215,12 +289,71 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
 	  access_dcsw },
 
+	DBG_BCR_BVR_WCR_WVR_EL1(0),
+	DBG_BCR_BVR_WCR_WVR_EL1(1),
+	/* MDCCINT_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000),
+	  trap_debug_regs, reset_val, MDCCINT_EL1, 0 },
+	/* MDSCR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010),
+	  trap_debug_regs, reset_val, MDSCR_EL1, 0 },
+	DBG_BCR_BVR_WCR_WVR_EL1(2),
+	DBG_BCR_BVR_WCR_WVR_EL1(3),
+	DBG_BCR_BVR_WCR_WVR_EL1(4),
+	DBG_BCR_BVR_WCR_WVR_EL1(5),
+	DBG_BCR_BVR_WCR_WVR_EL1(6),
+	DBG_BCR_BVR_WCR_WVR_EL1(7),
+	DBG_BCR_BVR_WCR_WVR_EL1(8),
+	DBG_BCR_BVR_WCR_WVR_EL1(9),
+	DBG_BCR_BVR_WCR_WVR_EL1(10),
+	DBG_BCR_BVR_WCR_WVR_EL1(11),
+	DBG_BCR_BVR_WCR_WVR_EL1(12),
+	DBG_BCR_BVR_WCR_WVR_EL1(13),
+	DBG_BCR_BVR_WCR_WVR_EL1(14),
+	DBG_BCR_BVR_WCR_WVR_EL1(15),
+
+	/* MDRAR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000),
+	  trap_raz_wi },
+	/* OSLAR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100),
+	  trap_raz_wi },
+	/* OSLSR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100),
+	  trap_oslsr_el1 },
+	/* OSDLR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100),
+	  trap_raz_wi },
+	/* DBGPRCR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100),
+	  trap_raz_wi },
+	/* DBGCLAIMSET_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110),
+	  trap_raz_wi },
+	/* DBGCLAIMCLR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110),
+	  trap_raz_wi },
+	/* DBGAUTHSTATUS_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
+	  trap_dbgauthstatus_el1 },
+
 	/* TEECR32_EL1 */
 	{ Op0(0b10), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000),
 	  NULL, reset_val, TEECR32_EL1, 0 },
 	/* TEEHBR32_EL1 */
 	{ Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b0000), Op2(0b000),
 	  NULL, reset_val, TEEHBR32_EL1, 0 },
+
+	/* MDCCSR_EL1 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000),
+	  trap_raz_wi },
+	/* DBGDTR_EL0 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000),
+	  trap_raz_wi },
+	/* DBGDTR[TR]X_EL0 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000),
+	  trap_raz_wi },
+
 	/* DBGVCR32_EL2 */
 	{ Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000),
 	  NULL, reset_val, DBGVCR32_EL2, 0 },
-- 
1.8.3.4

* [PATCH v3 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (2 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps Marc Zyngier
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

As we're about to trap a bunch of CP14 registers, let's rework
the CP15 handling so it can be generalized and work with multiple
tables.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h    |   2 +-
 arch/arm64/include/asm/kvm_coproc.h |   3 +-
 arch/arm64/include/asm/kvm_host.h   |   9 ++-
 arch/arm64/kvm/handle_exit.c        |   4 +-
 arch/arm64/kvm/sys_regs.c           | 133 +++++++++++++++++++++++++++++-------
 5 files changed, 122 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index e6b159a..12f9dd7 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -93,7 +93,7 @@
 #define c10_AMAIR0	(AMAIR_EL1 * 2)	/* Aux Memory Attr Indirection Reg */
 #define c10_AMAIR1	(c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
 #define c14_CNTKCTL	(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
-#define NR_CP15_REGS	(NR_SYS_REGS * 2)
+#define NR_COPRO_REGS	(NR_SYS_REGS * 2)
 
 #define ARM_EXCEPTION_IRQ	  0
 #define ARM_EXCEPTION_TRAP	  1
diff --git a/arch/arm64/include/asm/kvm_coproc.h b/arch/arm64/include/asm/kvm_coproc.h
index 9a59301..0b52377 100644
--- a/arch/arm64/include/asm/kvm_coproc.h
+++ b/arch/arm64/include/asm/kvm_coproc.h
@@ -39,7 +39,8 @@ void kvm_register_target_sys_reg_table(unsigned int target,
 				       struct kvm_sys_reg_target_table *table);
 
 int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 79573c86..108a297 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -86,7 +86,7 @@ struct kvm_cpu_context {
 	struct kvm_regs	gp_regs;
 	union {
 		u64 sys_regs[NR_SYS_REGS];
-		u32 cp15[NR_CP15_REGS];
+		u32 copro[NR_COPRO_REGS];
 	};
 };
 
@@ -141,7 +141,12 @@ struct kvm_vcpu_arch {
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
 #define vcpu_sys_reg(v,r)	((v)->arch.ctxt.sys_regs[(r)])
-#define vcpu_cp15(v,r)		((v)->arch.ctxt.cp15[(r)])
+/*
+ * CP14 and CP15 live in the same array, as they are backed by the
+ * same system registers.
+ */
+#define vcpu_cp14(v,r)		((v)->arch.ctxt.copro[(r)])
+#define vcpu_cp15(v,r)		((v)->arch.ctxt.copro[(r)])
 
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 182415e..e28be51 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -73,9 +73,9 @@ static exit_handle_fn arm_exit_handlers[] = {
 	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
 	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
 	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
-	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
+	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_32,
 	[ESR_EL2_EC_CP14_LS]	= kvm_handle_cp14_load_store,
-	[ESR_EL2_EC_CP14_64]	= kvm_handle_cp14_access,
+	[ESR_EL2_EC_CP14_64]	= kvm_handle_cp14_64,
 	[ESR_EL2_EC_HVC32]	= handle_hvc,
 	[ESR_EL2_EC_SMC32]	= handle_smc,
 	[ESR_EL2_EC_HVC64]	= handle_hvc,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 808e3b2..fb6eece 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -483,6 +483,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	  NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
+/* Trapped cp14 registers */
+static const struct sys_reg_desc cp14_regs[] = {
+};
+
 /*
  * Trapped cp15 registers. TTBR0/TTBR1 get a double encoding,
  * depending on the way they are accessed (as a 32bit or a 64bit
@@ -590,26 +594,29 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return 1;
 }
 
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run)
-{
-	kvm_inject_undefined(vcpu);
-	return 1;
-}
-
-static void emulate_cp15(struct kvm_vcpu *vcpu,
-			 const struct sys_reg_params *params)
+/*
+ * emulate_cp --  tries to match a sys_reg access in a handling table, and
+ *                call the corresponding trap handler.
+ *
+ * @params: pointer to the descriptor of the access
+ * @table: array of trap descriptors
+ * @num: size of the trap descriptor array
+ *
+ * Return 0 if the access has been handled, and -1 if not.
+ */
+static int emulate_cp(struct kvm_vcpu *vcpu,
+		      const struct sys_reg_params *params,
+		      const struct sys_reg_desc *table,
+		      size_t num)
 {
-	size_t num;
-	const struct sys_reg_desc *table, *r;
+	const struct sys_reg_desc *r;
 
-	table = get_target_table(vcpu->arch.target, false, &num);
+	if (!table)
+		return -1;	/* Not handled */
 
-	/* Search target-specific then generic table. */
 	r = find_reg(params, table, num);
-	if (!r)
-		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
 
-	if (likely(r)) {
+	if (r) {
 		/*
 		 * Not having an accessor means that we have
 		 * configured a trap that we don't know how to
@@ -621,22 +628,51 @@ static void emulate_cp15(struct kvm_vcpu *vcpu,
 		if (likely(r->access(vcpu, params, r))) {
 			/* Skip instruction, since it was emulated */
 			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-			return;
 		}
-		/* If access function fails, it should complain. */
+
+		/* Handled */
+		return 0;
 	}
 
-	kvm_err("Unsupported guest CP15 access at: %08lx\n", *vcpu_pc(vcpu));
+	/* Not handled */
+	return -1;
+}
+
+static void unhandled_cp_access(struct kvm_vcpu *vcpu,
+				struct sys_reg_params *params)
+{
+	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
+	int cp;
+
+	switch(hsr_ec) {
+	case ESR_EL2_EC_CP15_32:
+	case ESR_EL2_EC_CP15_64:
+		cp = 15;
+		break;
+	case ESR_EL2_EC_CP14_MR:
+	case ESR_EL2_EC_CP14_64:
+		cp = 14;
+		break;
+	default:
+		WARN_ON((cp = -1));
+	}
+
+	kvm_err("Unsupported guest CP%d access at: %08lx\n",
+		cp, *vcpu_pc(vcpu));
 	print_sys_reg_instr(params);
 	kvm_inject_undefined(vcpu);
 }
 
 /**
- * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access
+ * kvm_handle_cp_64 -- handles a mrrc/mcrr trap on a guest CP15 access
  * @vcpu: The VCPU pointer
  * @run:  The kvm_run struct
  */
-int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+static int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *global,
+			    size_t nr_global,
+			    const struct sys_reg_desc *target_specific,
+			    size_t nr_specific)
 {
 	struct sys_reg_params params;
 	u32 hsr = kvm_vcpu_get_hsr(vcpu);
@@ -665,8 +701,14 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		*vcpu_reg(vcpu, params.Rt) = val;
 	}
 
-	emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		goto out;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		goto out;
 
+	unhandled_cp_access(vcpu, &params);
+
+out:
 	/* Do the opposite hack for the read side */
 	if (!params.is_write) {
 		u64 val = *vcpu_reg(vcpu, params.Rt);
@@ -682,7 +724,11 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
  * @vcpu: The VCPU pointer
  * @run:  The kvm_run struct
  */
-int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+static int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *global,
+			    size_t nr_global,
+			    const struct sys_reg_desc *target_specific,
+			    size_t nr_specific)
 {
 	struct sys_reg_params params;
 	u32 hsr = kvm_vcpu_get_hsr(vcpu);
@@ -697,10 +743,51 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Op1 = (hsr >> 14) & 0x7;
 	params.Op2 = (hsr >> 17) & 0x7;
 
-	emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		return 1;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		return 1;
+
+	unhandled_cp_access(vcpu, &params);
 	return 1;
 }
 
+int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	const struct sys_reg_desc *target_specific;
+	size_t num;
+
+	target_specific = get_target_table(vcpu->arch.target, false, &num);
+	return kvm_handle_cp_64(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
+
+int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	const struct sys_reg_desc *target_specific;
+	size_t num;
+
+	target_specific = get_target_table(vcpu->arch.target, false, &num);
+	return kvm_handle_cp_32(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
+
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	return kvm_handle_cp_64(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
+}
+
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	return kvm_handle_cp_32(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
+}
+
 static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 			   const struct sys_reg_params *params)
 {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (3 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15 Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 6/9] arm64: KVM: check ordering of all system register tables Marc Zyngier
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

An interesting "feature" of the CP14 encoding is that the 32 and
64bit registers overlap, meaning they cannot live in a single table
the way the CP15 registers did.

Create separate tables for 64bit CP14 and CP15 registers, and
let the top level handler use the right one.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index fb6eece..1fb1bff 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -487,13 +487,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 static const struct sys_reg_desc cp14_regs[] = {
 };
 
+/* Trapped cp14 64bit registers */
+static const struct sys_reg_desc cp14_64_regs[] = {
+};
+
 /*
  * Trapped cp15 registers. TTBR0/TTBR1 get a double encoding,
  * depending on the way they are accessed (as a 32bit or a 64bit
  * register).
  */
 static const struct sys_reg_desc cp15_regs[] = {
-	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_sctlr, NULL, c1_SCTLR },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
@@ -534,6 +537,10 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 },
 	{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
 
+};
+
+static const struct sys_reg_desc cp15_64_regs[] = {
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 },
 };
 
@@ -759,7 +766,7 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 	target_specific = get_target_table(vcpu->arch.target, false, &num);
 	return kvm_handle_cp_64(vcpu,
-				cp15_regs, ARRAY_SIZE(cp15_regs),
+				cp15_64_regs, ARRAY_SIZE(cp15_64_regs),
 				target_specific, num);
 }
 
@@ -777,7 +784,7 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	return kvm_handle_cp_64(vcpu,
-				cp14_regs, ARRAY_SIZE(cp14_regs),
+				cp14_64_regs, ARRAY_SIZE(cp14_64_regs),
 				NULL, 0);
 }
 
-- 
1.8.3.4


* [PATCH v3 6/9] arm64: KVM: check ordering of all system register tables
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (4 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers Marc Zyngier
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

We now have multiple tables for the various system registers
we trap. Make sure we check the order of all of them, as it is
critical that we get the order right (been there, done that...).

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1fb1bff..9147b0c 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1299,14 +1299,32 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	return write_demux_regids(uindices);
 }
 
+static int check_sysreg_table(const struct sys_reg_desc *table, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 1; i < n; i++) {
+		if (cmp_sys_reg(&table[i-1], &table[i]) >= 0) {
+			kvm_err("sys_reg table %p out of order (%d)\n", table, i - 1);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 void kvm_sys_reg_table_init(void)
 {
 	unsigned int i;
 	struct sys_reg_desc clidr;
 
 	/* Make sure tables are unique and in order. */
-	for (i = 1; i < ARRAY_SIZE(sys_reg_descs); i++)
-		BUG_ON(cmp_sys_reg(&sys_reg_descs[i-1], &sys_reg_descs[i]) >= 0);
+	BUG_ON(check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs)));
+	BUG_ON(check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs)));
+	BUG_ON(check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs)));
+	BUG_ON(check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs)));
+	BUG_ON(check_sysreg_table(cp15_64_regs, ARRAY_SIZE(cp15_64_regs)));
+	BUG_ON(check_sysreg_table(invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs)));
 
 	/* We abuse the reset function to overwrite the table itself. */
 	for (i = 0; i < ARRAY_SIZE(invariant_sys_regs); i++)
-- 
1.8.3.4


* [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (5 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 6/9] arm64: KVM: check ordering of all system register tables Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-07-09  9:43   ` Christoffer Dall
  2014-06-20 13:00 ` [PATCH v3 8/9] arm64: KVM: implement lazy world switch for " Marc Zyngier
  2014-06-20 13:00 ` [PATCH v3 9/9] arm64: KVM: enable trapping of all " Marc Zyngier
  8 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Add handlers for all the AArch32 debug registers that are accessible
from EL0 or EL1. The code follows the same strategy as the AArch64
counterpart with regard to tracking the dirty state of the debug
registers.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h |   9 +++
 arch/arm64/kvm/sys_regs.c        | 144 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 151 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 12f9dd7..993a7db 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -93,6 +93,15 @@
 #define c10_AMAIR0	(AMAIR_EL1 * 2)	/* Aux Memory Attr Indirection Reg */
 #define c10_AMAIR1	(c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
 #define c14_CNTKCTL	(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
+
+#define cp14_DBGDSCRext	(MDSCR_EL1 * 2)
+#define cp14_DBGBCR0	(DBGBCR0_EL1 * 2)
+#define cp14_DBGBVR0	(DBGBVR0_EL1 * 2)
+#define cp14_DBGBXVR0	(cp14_DBGBVR0 + 1)
+#define cp14_DBGWCR0	(DBGWCR0_EL1 * 2)
+#define cp14_DBGWVR0	(DBGWVR0_EL1 * 2)
+#define cp14_DBGDCCINT	(MDCCINT_EL1 * 2)
+
 #define NR_COPRO_REGS	(NR_SYS_REGS * 2)
 
 #define ARM_EXCEPTION_IRQ	  0
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 9147b0c..daa635e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -483,12 +483,153 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	  NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
-/* Trapped cp14 registers */
+static bool trap_dbgidr(struct kvm_vcpu *vcpu,
+			const struct sys_reg_params *p,
+			const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
+		u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
+		u32 el3 = !!((pfr >> 12) & 0xf);
+
+		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
+					  (((dfr >> 12) & 0xf) << 24) |
+					  (((dfr >> 28) & 0xf) << 20) |
+					  (6 << 16) | (el3 << 14) | (el3 << 12));
+		return true;
+	}
+}
+
+static bool trap_debug32(struct kvm_vcpu *vcpu,
+			 const struct sys_reg_params *p,
+			 const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
+	}
+
+	return true;
+}
+
+#define DBG_BCR_BVR_WCR_WVR(n)					\
+	/* DBGBVRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 4), trap_debug32,	\
+	  NULL, (cp14_DBGBVR0 + (n) * 2) },			\
+	/* DBGBCRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 5), trap_debug32,	\
+	  NULL, (cp14_DBGBCR0 + (n) * 2) },			\
+	/* DBGWVRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 6), trap_debug32,	\
+	  NULL, (cp14_DBGWVR0 + (n) * 2) },			\
+	/* DBGWCRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 7), trap_debug32,	\
+	  NULL, (cp14_DBGWCR0 + (n) * 2) }
+
+#define DBGBXVR(n)						\
+	{ Op1( 0), CRn( 1), CRm((n)), Op2( 1), trap_debug32,	\
+	  NULL, cp14_DBGBXVR0 + n * 2 }
+
+/*
+ * Trapped cp14 registers. We generally ignore most of the external
+ * debug, on the principle that they don't really make sense to a
+ * guest. Revisit this one day, should this principle change.
+ */
 static const struct sys_reg_desc cp14_regs[] = {
+	/* DBGIDR */
+	{ Op1( 0), CRn( 0), CRm( 0), Op2( 0), trap_dbgidr },
+	/* DBGDTRRXext */
+	{ Op1( 0), CRn( 0), CRm( 0), Op2( 2), trap_raz_wi },
+
+	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDSCRint */
+	{ Op1( 0), CRn( 0), CRm( 1), Op2( 0), trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDCCINT */
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), trap_debug32 },
+	/* DBGDSCRext */
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 2), trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(2),
+	/* DBGDTR[RT]Xint */
+	{ Op1( 0), CRn( 0), CRm( 3), Op2( 0), trap_raz_wi },
+	/* DBGDTR[RT]Xext */
+	{ Op1( 0), CRn( 0), CRm( 3), Op2( 2), trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(3),
+	DBG_BCR_BVR_WCR_WVR(4),
+	DBG_BCR_BVR_WCR_WVR(5),
+	/* DBGWFAR */
+	{ Op1( 0), CRn( 0), CRm( 6), Op2( 0), trap_raz_wi },
+	/* DBGOSECCR */
+	{ Op1( 0), CRn( 0), CRm( 6), Op2( 2), trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(6),
+	/* DBGVCR */
+	{ Op1( 0), CRn( 0), CRm( 7), Op2( 0), trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(7),
+	DBG_BCR_BVR_WCR_WVR(8),
+	DBG_BCR_BVR_WCR_WVR(9),
+	DBG_BCR_BVR_WCR_WVR(10),
+	DBG_BCR_BVR_WCR_WVR(11),
+	DBG_BCR_BVR_WCR_WVR(12),
+	DBG_BCR_BVR_WCR_WVR(13),
+	DBG_BCR_BVR_WCR_WVR(14),
+	DBG_BCR_BVR_WCR_WVR(15),
+
+	/* DBGDRAR (32bit) */
+	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), trap_raz_wi },
+
+	DBGBXVR(0),
+	/* DBGOSLAR */
+	{ Op1( 0), CRn( 1), CRm( 0), Op2( 4), trap_raz_wi },
+	DBGBXVR(1),
+	/* DBGOSLSR */
+	{ Op1( 0), CRn( 1), CRm( 1), Op2( 4), trap_oslsr_el1 },
+	DBGBXVR(2),
+	DBGBXVR(3),
+	/* DBGOSDLR */
+	{ Op1( 0), CRn( 1), CRm( 3), Op2( 4), trap_raz_wi },
+	DBGBXVR(4),
+	/* DBGPRCR */
+	{ Op1( 0), CRn( 1), CRm( 4), Op2( 4), trap_raz_wi },
+	DBGBXVR(5),
+	DBGBXVR(6),
+	DBGBXVR(7),
+	DBGBXVR(8),
+	DBGBXVR(9),
+	DBGBXVR(10),
+	DBGBXVR(11),
+	DBGBXVR(12),
+	DBGBXVR(13),
+	DBGBXVR(14),
+	DBGBXVR(15),
+
+	/* DBGDSAR (32bit) */
+	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), trap_raz_wi },
+
+	/* DBGDEVID2 */
+	{ Op1( 0), CRn( 7), CRm( 0), Op2( 7), trap_raz_wi },
+	/* DBGDEVID1 */
+	{ Op1( 0), CRn( 7), CRm( 1), Op2( 7), trap_raz_wi },
+	/* DBGDEVID */
+	{ Op1( 0), CRn( 7), CRm( 2), Op2( 7), trap_raz_wi },
+	/* DBGCLAIMSET */
+	{ Op1( 0), CRn( 7), CRm( 8), Op2( 6), trap_raz_wi },
+	/* DBGCLAIMCLR */
+	{ Op1( 0), CRn( 7), CRm( 9), Op2( 6), trap_raz_wi },
+	/* DBGAUTHSTATUS */
+	{ Op1( 0), CRn( 7), CRm(14), Op2( 6), trap_dbgauthstatus_el1 },
 };
 
 /* Trapped cp14 64bit registers */
 static const struct sys_reg_desc cp14_64_regs[] = {
+	/* DBGDRAR (64bit) */
+	{ Op1( 0), CRm( 1), .access = trap_raz_wi },
+
+	/* DBGDSAR (64bit) */
+	{ Op1( 0), CRm( 2), .access = trap_raz_wi },
 };
 
 /*
@@ -536,7 +677,6 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 0), access_vm_reg, NULL, c10_AMAIR0 },
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 },
 	{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
-
 };
 
 static const struct sys_reg_desc cp15_64_regs[] = {
-- 
1.8.3.4


* [PATCH v3 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (6 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  2014-07-09  9:45   ` Christoffer Dall
  2014-06-20 13:00 ` [PATCH v3 9/9] arm64: KVM: enable trapping of all " Marc Zyngier
  8 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Implement switching of the debug registers. While the number
of registers is massive, CPUs usually don't implement them all
(A57 has 6 breakpoints and 4 watchpoints, which gives us a total
of 22 registers "only").

Also, we only save/restore them when MDSCR_EL1 has debug enabled,
or when we've flagged the debug registers as dirty. It means that
most of the time, we only save/restore MDSCR_EL1.

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/asm-offsets.c |   1 +
 arch/arm64/kvm/hyp.S            | 462 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 457 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 646f888..ae73a83 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -120,6 +120,7 @@ int main(void)
   DEFINE(VCPU_ESR_EL2,		offsetof(struct kvm_vcpu, arch.fault.esr_el2));
   DEFINE(VCPU_FAR_EL2,		offsetof(struct kvm_vcpu, arch.fault.far_el2));
   DEFINE(VCPU_HPFAR_EL2,	offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
+  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
   DEFINE(VCPU_HCR_EL2,		offsetof(struct kvm_vcpu, arch.hcr_el2));
   DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
   DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b0d1512..727087c 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -21,6 +21,7 @@
 #include <asm/assembler.h>
 #include <asm/memory.h>
 #include <asm/asm-offsets.h>
+#include <asm/debug-monitors.h>
 #include <asm/fpsimdmacros.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
@@ -215,6 +216,7 @@ __kvm_hyp_code_start:
 	mrs	x22, 	amair_el1
 	mrs	x23, 	cntkctl_el1
 	mrs	x24,	par_el1
+	mrs	x25,	mdscr_el1
 
 	stp	x4, x5, [x3]
 	stp	x6, x7, [x3, #16]
@@ -226,7 +228,202 @@ __kvm_hyp_code_start:
 	stp	x18, x19, [x3, #112]
 	stp	x20, x21, [x3, #128]
 	stp	x22, x23, [x3, #144]
-	str	x24, [x3, #160]
+	stp	x24, x25, [x3, #160]
+.endm
+
+.macro save_debug
+	// x2: base address for cpu context
+	// x3: tmp register
+
+	mrs	x26, id_aa64dfr0_el1
+	ubfx	x24, x26, #12, #4	// Extract BRPs
+	ubfx	x25, x26, #20, #4	// Extract WRPs
+	mov	w26, #15
+	sub	w24, w26, w24		// How many BPs to skip
+	sub	w25, w26, w25		// How many WPs to skip
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgbcr15_el1
+	mrs	x19, dbgbcr14_el1
+	mrs	x18, dbgbcr13_el1
+	mrs	x17, dbgbcr12_el1
+	mrs	x16, dbgbcr11_el1
+	mrs	x15, dbgbcr10_el1
+	mrs	x14, dbgbcr9_el1
+	mrs	x13, dbgbcr8_el1
+	mrs	x12, dbgbcr7_el1
+	mrs	x11, dbgbcr6_el1
+	mrs	x10, dbgbcr5_el1
+	mrs	x9, dbgbcr4_el1
+	mrs	x8, dbgbcr3_el1
+	mrs	x7, dbgbcr2_el1
+	mrs	x6, dbgbcr1_el1
+	mrs	x5, dbgbcr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgbvr15_el1
+	mrs	x19, dbgbvr14_el1
+	mrs	x18, dbgbvr13_el1
+	mrs	x17, dbgbvr12_el1
+	mrs	x16, dbgbvr11_el1
+	mrs	x15, dbgbvr10_el1
+	mrs	x14, dbgbvr9_el1
+	mrs	x13, dbgbvr8_el1
+	mrs	x12, dbgbvr7_el1
+	mrs	x11, dbgbvr6_el1
+	mrs	x10, dbgbvr5_el1
+	mrs	x9, dbgbvr4_el1
+	mrs	x8, dbgbvr3_el1
+	mrs	x7, dbgbvr2_el1
+	mrs	x6, dbgbvr1_el1
+	mrs	x5, dbgbvr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgwcr15_el1
+	mrs	x19, dbgwcr14_el1
+	mrs	x18, dbgwcr13_el1
+	mrs	x17, dbgwcr12_el1
+	mrs	x16, dbgwcr11_el1
+	mrs	x15, dbgwcr10_el1
+	mrs	x14, dbgwcr9_el1
+	mrs	x13, dbgwcr8_el1
+	mrs	x12, dbgwcr7_el1
+	mrs	x11, dbgwcr6_el1
+	mrs	x10, dbgwcr5_el1
+	mrs	x9, dbgwcr4_el1
+	mrs	x8, dbgwcr3_el1
+	mrs	x7, dbgwcr2_el1
+	mrs	x6, dbgwcr1_el1
+	mrs	x5, dbgwcr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgwvr15_el1
+	mrs	x19, dbgwvr14_el1
+	mrs	x18, dbgwvr13_el1
+	mrs	x17, dbgwvr12_el1
+	mrs	x16, dbgwvr11_el1
+	mrs	x15, dbgwvr10_el1
+	mrs	x14, dbgwvr9_el1
+	mrs	x13, dbgwvr8_el1
+	mrs	x12, dbgwvr7_el1
+	mrs	x11, dbgwvr6_el1
+	mrs	x10, dbgwvr5_el1
+	mrs	x9, dbgwvr4_el1
+	mrs	x8, dbgwvr3_el1
+	mrs	x7, dbgwvr2_el1
+	mrs	x6, dbgwvr1_el1
+	mrs	x5, dbgwvr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	mrs	x21, mdccint_el1
+	str	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
 .endm
 
 .macro restore_sysregs
@@ -245,7 +442,7 @@ __kvm_hyp_code_start:
 	ldp	x18, x19, [x3, #112]
 	ldp	x20, x21, [x3, #128]
 	ldp	x22, x23, [x3, #144]
-	ldr	x24, [x3, #160]
+	ldp	x24, x25, [x3, #160]
 
 	msr	vmpidr_el2,	x4
 	msr	csselr_el1,	x5
@@ -268,6 +465,198 @@ __kvm_hyp_code_start:
 	msr	amair_el1,	x22
 	msr	cntkctl_el1,	x23
 	msr	par_el1,	x24
+	msr	mdscr_el1,	x25
+.endm
+
+.macro restore_debug
+	// x2: base address for cpu context
+	// x3: tmp register
+
+	mrs	x26, id_aa64dfr0_el1
+	ubfx	x24, x26, #12, #4	// Extract BRPs
+	ubfx	x25, x26, #20, #4	// Extract WRPs
+	mov	w26, #15
+	sub	w24, w26, w24		// How many BPs to skip
+	sub	w25, w26, w25		// How many WPs to skip
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	msr	dbgbcr15_el1, x20
+	msr	dbgbcr14_el1, x19
+	msr	dbgbcr13_el1, x18
+	msr	dbgbcr12_el1, x17
+	msr	dbgbcr11_el1, x16
+	msr	dbgbcr10_el1, x15
+	msr	dbgbcr9_el1, x14
+	msr	dbgbcr8_el1, x13
+	msr	dbgbcr7_el1, x12
+	msr	dbgbcr6_el1, x11
+	msr	dbgbcr5_el1, x10
+	msr	dbgbcr4_el1, x9
+	msr	dbgbcr3_el1, x8
+	msr	dbgbcr2_el1, x7
+	msr	dbgbcr1_el1, x6
+	msr	dbgbcr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	msr	dbgbvr15_el1, x20
+	msr	dbgbvr14_el1, x19
+	msr	dbgbvr13_el1, x18
+	msr	dbgbvr12_el1, x17
+	msr	dbgbvr11_el1, x16
+	msr	dbgbvr10_el1, x15
+	msr	dbgbvr9_el1, x14
+	msr	dbgbvr8_el1, x13
+	msr	dbgbvr7_el1, x12
+	msr	dbgbvr6_el1, x11
+	msr	dbgbvr5_el1, x10
+	msr	dbgbvr4_el1, x9
+	msr	dbgbvr3_el1, x8
+	msr	dbgbvr2_el1, x7
+	msr	dbgbvr1_el1, x6
+	msr	dbgbvr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	msr	dbgwcr15_el1, x20
+	msr	dbgwcr14_el1, x19
+	msr	dbgwcr13_el1, x18
+	msr	dbgwcr12_el1, x17
+	msr	dbgwcr11_el1, x16
+	msr	dbgwcr10_el1, x15
+	msr	dbgwcr9_el1, x14
+	msr	dbgwcr8_el1, x13
+	msr	dbgwcr7_el1, x12
+	msr	dbgwcr6_el1, x11
+	msr	dbgwcr5_el1, x10
+	msr	dbgwcr4_el1, x9
+	msr	dbgwcr3_el1, x8
+	msr	dbgwcr2_el1, x7
+	msr	dbgwcr1_el1, x6
+	msr	dbgwcr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	msr	dbgwvr15_el1, x20
+	msr	dbgwvr14_el1, x19
+	msr	dbgwvr13_el1, x18
+	msr	dbgwvr12_el1, x17
+	msr	dbgwvr11_el1, x16
+	msr	dbgwvr10_el1, x15
+	msr	dbgwvr9_el1, x14
+	msr	dbgwvr8_el1, x13
+	msr	dbgwvr7_el1, x12
+	msr	dbgwvr6_el1, x11
+	msr	dbgwvr5_el1, x10
+	msr	dbgwvr4_el1, x9
+	msr	dbgwvr3_el1, x8
+	msr	dbgwvr2_el1, x7
+	msr	dbgwvr1_el1, x6
+	msr	dbgwvr0_el1, x5
+
+	ldr	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
+	msr	mdccint_el1, x21
 .endm
 
 .macro skip_32bit_state tmp, target
@@ -282,6 +671,35 @@ __kvm_hyp_code_start:
 	tbz	\tmp, #12, \target
 .endm
 
+.macro skip_debug_state tmp, target
+	ldr	\tmp, [x0, #VCPU_DEBUG_FLAGS]
+	tbz	\tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
+.endm
+
+.macro compute_debug_state target
+	// Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
+	// is set, we do a full save/restore cycle and disable trapping.
+	add	x25, x0, #VCPU_CONTEXT
+
+	// Check the state of MDSCR_EL1
+	ldr	x25, [x25, #CPU_SYSREG_OFFSET(MDSCR_EL1)]
+	and	x26, x25, #DBG_MDSCR_KDE
+	and	x25, x25, #DBG_MDSCR_MDE
+	adds	xzr, x25, x26
+	b.eq	9998f		// Nothing to see there
+
+	// If any interesting bits was set, we must set the flag
+	mov	x26, #KVM_ARM64_DEBUG_DIRTY
+	str	x26, [x0, #VCPU_DEBUG_FLAGS]
+	b	9999f		// Don't skip restore
+
+9998:
+	// Otherwise load the flags from memory in case we recently
+	// trapped
+	skip_debug_state x25, \target
+9999:
+.endm
+
 .macro save_guest_32bit_state
 	skip_32bit_state x3, 1f
 
@@ -297,10 +715,13 @@ __kvm_hyp_code_start:
 	mrs	x4, dacr32_el2
 	mrs	x5, ifsr32_el2
 	mrs	x6, fpexc32_el2
-	mrs	x7, dbgvcr32_el2
 	stp	x4, x5, [x3]
-	stp	x6, x7, [x3, #16]
+	str	x6, [x3, #16]
 
+	skip_debug_state x8, 2f
+	mrs	x7, dbgvcr32_el2
+	str	x7, [x3, #24]
+2:
 	skip_tee_state x8, 1f
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
@@ -323,12 +744,15 @@ __kvm_hyp_code_start:
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
 	ldp	x4, x5, [x3]
-	ldp	x6, x7, [x3, #16]
+	ldr	x6, [x3, #16]
 	msr	dacr32_el2, x4
 	msr	ifsr32_el2, x5
 	msr	fpexc32_el2, x6
-	msr	dbgvcr32_el2, x7
 
+	skip_debug_state x8, 2f
+	ldr	x7, [x3, #24]
+	msr	dbgvcr32_el2, x7
+2:
 	skip_tee_state x8, 1f
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
@@ -537,6 +961,14 @@ __restore_sysregs:
 	restore_sysregs
 	ret
 
+__save_debug:
+	save_debug
+	ret
+
+__restore_debug:
+	restore_debug
+	ret
+
 __save_fpsimd:
 	save_fpsimd
 	ret
@@ -568,6 +1000,9 @@ ENTRY(__kvm_vcpu_run)
 	bl __save_fpsimd
 	bl __save_sysregs
 
+	compute_debug_state 1f
+	bl	__save_debug
+1:
 	activate_traps
 	activate_vm
 
@@ -579,6 +1014,10 @@ ENTRY(__kvm_vcpu_run)
 
 	bl __restore_sysregs
 	bl __restore_fpsimd
+
+	skip_debug_state x3, 1f
+	bl	__restore_debug
+1:
 	restore_guest_32bit_state
 	restore_guest_regs
 
@@ -595,6 +1034,10 @@ __kvm_vcpu_return:
 	save_guest_regs
 	bl __save_fpsimd
 	bl __save_sysregs
+
+	skip_debug_state x3, 1f
+	bl	__save_debug
+1:
 	save_guest_32bit_state
 
 	save_timer_state
@@ -609,6 +1052,13 @@ __kvm_vcpu_return:
 
 	bl __restore_sysregs
 	bl __restore_fpsimd
+
+	skip_debug_state x3, 1f
+	// Clear the dirty flag for the next run, as all the state has
+	// already been saved.
+	str	xzr, [x0, #VCPU_DEBUG_FLAGS]
+	bl	__restore_debug
+1:
 	restore_host_regs
 
 	mov	x0, x1
-- 
1.8.3.4


* [PATCH v3 9/9] arm64: KVM: enable trapping of all debug registers
  2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
                   ` (7 preceding siblings ...)
  2014-06-20 13:00 ` [PATCH v3 8/9] arm64: KVM: implement lazy world switch for " Marc Zyngier
@ 2014-06-20 13:00 ` Marc Zyngier
  8 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-06-20 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Enable trapping of the debug registers, preventing guests from
messing with the host state (and allowing guests to use the debug
infrastructure as well).

Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp.S | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 727087c..e36ca91 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -777,6 +777,14 @@ __kvm_hyp_code_start:
 	mrs	x2, mdcr_el2
 	and	x2, x2, #MDCR_EL2_HPMN_MASK
 	orr	x2, x2, #(MDCR_EL2_TPM | MDCR_EL2_TPMCR)
+	orr	x2, x2, #(MDCR_EL2_TDRA | MDCR_EL2_TDOSA)
+
+	// Check for KVM_ARM64_DEBUG_DIRTY, and set debug to trap
+	// if not dirty.
+	ldr	x3, [x0, #VCPU_DEBUG_FLAGS]
+	tbnz	x3, #KVM_ARM64_DEBUG_DIRTY_SHIFT, 1f
+	orr	x2, x2,  #MDCR_EL2_TDA
+1:
 	msr	mdcr_el2, x2
 .endm
 
-- 
1.8.3.4


* [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi
  2014-06-20 12:59 ` [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi Marc Zyngier
@ 2014-07-09  9:27   ` Christoffer Dall
  2014-07-09  9:36     ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2014-07-09  9:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 20, 2014 at 01:59:59PM +0100, Marc Zyngier wrote:
> pm_fake doesn't quite describe what the handler does (ignoring writes
> and returning 0 for reads).
> 
> As we're about to use it (a lot) in a different context, rename it
> with a (admitedly cryptic) name that make sense for all users.
> 
> Reviewed-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

didn't I already review this?  Did something change, it doesn't look
like it from the changelog....

/me confused.


* [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi
  2014-07-09  9:27   ` Christoffer Dall
@ 2014-07-09  9:36     ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-07-09  9:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 09 2014 at 10:27:12 am BST, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Fri, Jun 20, 2014 at 01:59:59PM +0100, Marc Zyngier wrote:
>> pm_fake doesn't quite describe what the handler does (ignoring writes
>> and returning 0 for reads).
>> 
>> As we're about to use it (a lot) in a different context, rename it
>> with a (admitedly cryptic) name that make sense for all users.
>> 
>> Reviewed-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>
> didn't I already review this?  Did something change, it doesn't look
> like it from the changelog....
>
> /me confused.

Nothing changed, just me forgetting to add your Reviewed-by tag.

Sorry about that.

	M.
-- 
Without deviation from the norm, progress is not possible.


* [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-06-20 13:00 ` [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers Marc Zyngier
@ 2014-07-09  9:38   ` Christoffer Dall
  2014-07-09 11:09     ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2014-07-09  9:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 20, 2014 at 02:00:01PM +0100, Marc Zyngier wrote:
> Add handlers for all the AArch64 debug registers that are accessible
> from EL0 or EL1. The trapping code keeps track of the state of the
> debug registers, allowing for the switch code to implement a lazy
> switching strategy.
> 
> Reviewed-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
>  arch/arm64/include/asm/kvm_host.h |   3 +
>  arch/arm64/kvm/sys_regs.c         | 137 +++++++++++++++++++++++++++++++++++++-
>  3 files changed, 159 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 9fcd54b..e6b159a 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -43,14 +43,25 @@
>  #define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
>  #define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
>  #define	PAR_EL1		21	/* Physical Address Register */
> +#define MDSCR_EL1	22	/* Monitor Debug System Control Register */
> +#define DBGBCR0_EL1	23	/* Debug Breakpoint Control Registers (0-15) */
> +#define DBGBCR15_EL1	38
> +#define DBGBVR0_EL1	39	/* Debug Breakpoint Value Registers (0-15) */
> +#define DBGBVR15_EL1	54
> +#define DBGWCR0_EL1	55	/* Debug Watchpoint Control Registers (0-15) */
> +#define DBGWCR15_EL1	70
> +#define DBGWVR0_EL1	71	/* Debug Watchpoint Value Registers (0-15) */
> +#define DBGWVR15_EL1	86
> +#define MDCCINT_EL1	87	/* Monitor Debug Comms Channel Interrupt Enable Reg */
> +
>  /* 32bit specific registers. Keep them at the end of the range */
> -#define	DACR32_EL2	22	/* Domain Access Control Register */
> -#define	IFSR32_EL2	23	/* Instruction Fault Status Register */
> -#define	FPEXC32_EL2	24	/* Floating-Point Exception Control Register */
> -#define	DBGVCR32_EL2	25	/* Debug Vector Catch Register */
> -#define	TEECR32_EL1	26	/* ThumbEE Configuration Register */
> -#define	TEEHBR32_EL1	27	/* ThumbEE Handler Base Register */
> -#define	NR_SYS_REGS	28
> +#define	DACR32_EL2	88	/* Domain Access Control Register */
> +#define	IFSR32_EL2	89	/* Instruction Fault Status Register */
> +#define	FPEXC32_EL2	90	/* Floating-Point Exception Control Register */
> +#define	DBGVCR32_EL2	91	/* Debug Vector Catch Register */
> +#define	TEECR32_EL1	92	/* ThumbEE Configuration Register */
> +#define	TEEHBR32_EL1	93	/* ThumbEE Handler Base Register */
> +#define	NR_SYS_REGS	94
>  
>  /* 32bit mapping */
>  #define c0_MPIDR	(MPIDR_EL1 * 2)	/* MultiProcessor ID Register */
> @@ -87,6 +98,9 @@
>  #define ARM_EXCEPTION_IRQ	  0
>  #define ARM_EXCEPTION_TRAP	  1
>  
> +#define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
> +#define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
> +
>  #ifndef __ASSEMBLY__
>  struct kvm;
>  struct kvm_vcpu;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 92242ce..79573c86 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -101,6 +101,9 @@ struct kvm_vcpu_arch {
>  	/* Exception Information */
>  	struct kvm_vcpu_fault_info fault;
>  
> +	/* Debug state */
> +	u64 debug_flags;
> +
>  	/* Pointer to host CPU context */
>  	kvm_cpu_context_t *host_cpu_context;
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 4abd84e..808e3b2 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -30,6 +30,7 @@
>  #include <asm/kvm_mmu.h>
>  #include <asm/cacheflush.h>
>  #include <asm/cputype.h>
> +#include <asm/debug-monitors.h>
>  #include <trace/events/kvm.h>
>  
>  #include "sys_regs.h"
> @@ -173,6 +174,60 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
>  		return read_zero(vcpu, p);
>  }
>  
> +static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
> +			   const struct sys_reg_params *p,
> +			   const struct sys_reg_desc *r)
> +{
> +	if (p->is_write) {
> +		return ignore_write(vcpu, p);
> +	} else {
> +		*vcpu_reg(vcpu, p->Rt) = (1 << 3);
> +		return true;
> +	}
> +}
> +
> +static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
> +				   const struct sys_reg_params *p,
> +				   const struct sys_reg_desc *r)
> +{
> +	if (p->is_write) {
> +		return ignore_write(vcpu, p);
> +	} else {
> +		u32 val;
> +		asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
> +		*vcpu_reg(vcpu, p->Rt) = val;
> +		return true;
> +	}
> +}
> +
> +/*
> + * We want to avoid world-switching all the DBG registers all the
> + * time. For this, we use a DIRTY but, indicating the guest has

a DIRTY but?  (at least there's only one t in there).

> + * modified the debug registers, and only restore the registers once,
> + * disabling traps.

I don't think I understand the "only restore the registers once" bit
here.  I know I'm being incredibly stupid, but I forgot since the last
review round how this actually works: when we return from the guest and
the guest has somehow enabled certain DBG functionality, we set the
dirty flag, which means we stop trapping and context-switch all the
registers on world-switches; but if we see, when returning from the
guest, that the guest doesn't appear to be using the registers, we
re-enable trapping and stop world-switching, right?

Do we clearly define which state triggers the world-switching and why
that's a good rationale? (sorry, the debug architecture is not my
favorite part of the ARM ARM).
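(To check my own understanding, here is a toy C model of that decision.
The flag and MDSCR bit names are lifted from the patch; the struct and
function are made up for illustration -- this is not the actual KVM
code:)

```c
#include <stdbool.h>
#include <stdint.h>

/* Names as in the patch; bit positions per asm/debug-monitors.h */
#define DBG_MDSCR_KDE		(1u << 13)
#define DBG_MDSCR_MDE		(1u << 15)
#define KVM_ARM64_DEBUG_DIRTY	(1u << 0)

struct vcpu_dbg {			/* toy stand-in for kvm_vcpu_arch */
	uint64_t mdscr_el1;
	uint64_t debug_flags;
};

/*
 * Toy model of compute_debug_state: decide, on guest entry, whether
 * this run must world-switch the full debug register file.
 */
static bool need_full_debug_switch(struct vcpu_dbg *v)
{
	if (v->mdscr_el1 & (DBG_MDSCR_KDE | DBG_MDSCR_MDE)) {
		/* guest enabled debug: mark dirty, save/restore this run */
		v->debug_flags |= KVM_ARM64_DEBUG_DIRTY;
		return true;
	}
	/* otherwise, only if a trapped register write already set the flag */
	return v->debug_flags & KVM_ARM64_DEBUG_DIRTY;
}
```

i.e. once the dirty flag is set by a trapped write, we keep
world-switching until something clears it again.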

> + *
> + * The best thing to do would be to trap MDSCR_EL1 independently, test
> + * if DBG_MDSCR_KDE or DBG_MDSCR_MDE is getting set, and only set the
> + * DIRTY bit in that case.
> + *
> + * Unfortunately, "old" Linux kernels tend to hit MDSCR_EL1 like a
> + * woodpecker on a tree, and it is better to disable trapping as soon
> + * as possible in this case. Some day, make this a tuneable...
> + */
> +static bool trap_debug_regs(struct kvm_vcpu *vcpu,
> +			    const struct sys_reg_params *p,
> +			    const struct sys_reg_desc *r)
> +{
> +	if (p->is_write) {
> +		vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> +		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> +	} else {
> +		*vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
> +	}
> +
> +	return true;
> +}
> +
>  static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>  {
>  	u64 amair;
> @@ -189,6 +244,21 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>  	vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
>  }
>  
> +/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
> +#define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
> +	/* DBGBVRn_EL1 */						\
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100),	\
> +	  trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 },		\
> +	/* DBGBCRn_EL1 */						\
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101),	\
> +	  trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 },		\
> +	/* DBGWVRn_EL1 */						\
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110),	\
> +	  trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 },		\
> +	/* DBGWCRn_EL1 */						\
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111),	\
> +	  trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 }
> +
>  /*
>   * Architected system registers.
>   * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> @@ -201,8 +271,12 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>   * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
>   * all PM registers, which doesn't crash the guest kernel at least.
>   *
> - * Same goes for the whole debug infrastructure, which probably breaks
> - * some guest functionnality. This should be fixed.
> + * Debug handling: We do trap most, if not all debug related system
> + * registers. The implementation is good enough to ensure that a guest
> + * can use these with minimal performance degradation. The drawback is
> + * that we don't implement any of the external debug, none of the
> + * OSlock protocol. This should be revisited if we ever encounter a
> + * more demanding guest...
>   */
>  static const struct sys_reg_desc sys_reg_descs[] = {
>  	/* DC ISW */
> @@ -215,12 +289,71 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	{ Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
>  	  access_dcsw },
>  
> +	DBG_BCR_BVR_WCR_WVR_EL1(0),
> +	DBG_BCR_BVR_WCR_WVR_EL1(1),
> +	/* MDCCINT_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000),
> +	  trap_debug_regs, reset_val, MDCCINT_EL1, 0 },
> +	/* MDSCR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010),
> +	  trap_debug_regs, reset_val, MDSCR_EL1, 0 },
> +	DBG_BCR_BVR_WCR_WVR_EL1(2),
> +	DBG_BCR_BVR_WCR_WVR_EL1(3),
> +	DBG_BCR_BVR_WCR_WVR_EL1(4),
> +	DBG_BCR_BVR_WCR_WVR_EL1(5),
> +	DBG_BCR_BVR_WCR_WVR_EL1(6),
> +	DBG_BCR_BVR_WCR_WVR_EL1(7),
> +	DBG_BCR_BVR_WCR_WVR_EL1(8),
> +	DBG_BCR_BVR_WCR_WVR_EL1(9),
> +	DBG_BCR_BVR_WCR_WVR_EL1(10),
> +	DBG_BCR_BVR_WCR_WVR_EL1(11),
> +	DBG_BCR_BVR_WCR_WVR_EL1(12),
> +	DBG_BCR_BVR_WCR_WVR_EL1(13),
> +	DBG_BCR_BVR_WCR_WVR_EL1(14),
> +	DBG_BCR_BVR_WCR_WVR_EL1(15),
> +
> +	/* MDRAR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000),
> +	  trap_raz_wi },
> +	/* OSLAR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100),
> +	  trap_raz_wi },
> +	/* OSLSR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100),
> +	  trap_oslsr_el1 },
> +	/* OSDLR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100),
> +	  trap_raz_wi },
> +	/* DBGPRCR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100),
> +	  trap_raz_wi },
> +	/* DBGCLAIMSET_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110),
> +	  trap_raz_wi },
> +	/* DBGCLAIMCLR_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110),
> +	  trap_raz_wi },
> +	/* DBGAUTHSTATUS_EL1 */
> +	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
> +	  trap_dbgauthstatus_el1 },
> +
>  	/* TEECR32_EL1 */
>  	{ Op0(0b10), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000),
>  	  NULL, reset_val, TEECR32_EL1, 0 },
>  	/* TEEHBR32_EL1 */
>  	{ Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b0000), Op2(0b000),
>  	  NULL, reset_val, TEEHBR32_EL1, 0 },
> +
> +	/* MDCCSR_EL1 */
> +	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000),
> +	  trap_raz_wi },
> +	/* DBGDTR_EL0 */
> +	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000),
> +	  trap_raz_wi },
> +	/* DBGDTR[TR]X_EL0 */
> +	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000),
> +	  trap_raz_wi },
> +
>  	/* DBGVCR32_EL2 */
>  	{ Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000),
>  	  NULL, reset_val, DBGVCR32_EL2, 0 },
> -- 
> 1.8.3.4
> 

Besides the commenting stuff above:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers
  2014-06-20 13:00 ` [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers Marc Zyngier
@ 2014-07-09  9:43   ` Christoffer Dall
  0 siblings, 0 replies; 19+ messages in thread
From: Christoffer Dall @ 2014-07-09  9:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 20, 2014 at 02:00:05PM +0100, Marc Zyngier wrote:
> Add handlers for all the AArch32 debug registers that are accessible
> from EL0 or EL1. The code follows the same strategy as the AArch64
> counterpart with regard to tracking the dirty state of the debug
> registers.
> 
> Reviewed-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h |   9 +++
>  arch/arm64/kvm/sys_regs.c        | 144 ++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 151 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 12f9dd7..993a7db 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -93,6 +93,15 @@
>  #define c10_AMAIR0	(AMAIR_EL1 * 2)	/* Aux Memory Attr Indirection Reg */
>  #define c10_AMAIR1	(c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
>  #define c14_CNTKCTL	(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
> +
> +#define cp14_DBGDSCRext	(MDSCR_EL1 * 2)
> +#define cp14_DBGBCR0	(DBGBCR0_EL1 * 2)
> +#define cp14_DBGBVR0	(DBGBVR0_EL1 * 2)
> +#define cp14_DBGBXVR0	(cp14_DBGBVR0 + 1)
> +#define cp14_DBGWCR0	(DBGWCR0_EL1 * 2)
> +#define cp14_DBGWVR0	(DBGWVR0_EL1 * 2)
> +#define cp14_DBGDCCINT	(MDCCINT_EL1 * 2)
> +
>  #define NR_COPRO_REGS	(NR_SYS_REGS * 2)
>  
>  #define ARM_EXCEPTION_IRQ	  0
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 9147b0c..daa635e 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -483,12 +483,153 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	  NULL, reset_val, FPEXC32_EL2, 0x70 },
>  };
>  
> -/* Trapped cp14 registers */
> +static bool trap_dbgidr(struct kvm_vcpu *vcpu,
> +			const struct sys_reg_params *p,
> +			const struct sys_reg_desc *r)
> +{
> +	if (p->is_write) {
> +		return ignore_write(vcpu, p);
> +	} else {
> +		u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
> +		u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
> +		u32 el3 = !!((pfr >> 12) & 0xf);
> +
> +		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
> +					  (((dfr >> 12) & 0xf) << 24) |
> +					  (((dfr >> 28) & 0xf) << 20) |
> +					  (6 << 16) | (el3 << 14) | (el3 << 12));
> +		return true;
> +	}
> +}
> +
> +static bool trap_debug32(struct kvm_vcpu *vcpu,
> +			 const struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	if (p->is_write) {
> +		vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> +		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> +	} else {
> +		*vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
> +	}
> +
> +	return true;
> +}
> +
> +#define DBG_BCR_BVR_WCR_WVR(n)					\
> +	/* DBGBVRn */						\
> +	{ Op1( 0), CRn( 0), CRm((n)), Op2( 4), trap_debug32,	\
> +	  NULL, (cp14_DBGBVR0 + (n) * 2) },			\
> +	/* DBGBCRn */						\
> +	{ Op1( 0), CRn( 0), CRm((n)), Op2( 5), trap_debug32,	\
> +	  NULL, (cp14_DBGBCR0 + (n) * 2) },			\
> +	/* DBGWVRn */						\
> +	{ Op1( 0), CRn( 0), CRm((n)), Op2( 6), trap_debug32,	\
> +	  NULL, (cp14_DBGWVR0 + (n) * 2) },			\
> +	/* DBGWCRn */						\
> +	{ Op1( 0), CRn( 0), CRm((n)), Op2( 7), trap_debug32,	\
> +	  NULL, (cp14_DBGWCR0 + (n) * 2) }
> +
> +#define DBGBXVR(n)						\
> +	{ Op1( 0), CRn( 1), CRm((n)), Op2( 1), trap_debug32,	\
> +	  NULL, cp14_DBGBXVR0 + n * 2 }
> +
> +/*
> + * Trapped cp14 registers. We generally ignore most of the external
> + * debug registers, on the principle that they don't really make
> + * sense to a guest. Revisit this one day, should this principle change.
> + */
>  static const struct sys_reg_desc cp14_regs[] = {
> +	/* DBGIDR */
> +	{ Op1( 0), CRn( 0), CRm( 0), Op2( 0), trap_dbgidr },
> +	/* DBGDTRRXext */
> +	{ Op1( 0), CRn( 0), CRm( 0), Op2( 2), trap_raz_wi },
> +
> +	DBG_BCR_BVR_WCR_WVR(0),
> +	/* DBGDSCRint */
> +	{ Op1( 0), CRn( 0), CRm( 1), Op2( 0), trap_raz_wi },
> +	DBG_BCR_BVR_WCR_WVR(1),
> +	/* DBGDCCINT */
> +	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), trap_debug32 },
> +	/* DBGDSCRext */
> +	{ Op1( 0), CRn( 0), CRm( 2), Op2( 2), trap_debug32 },
> +	DBG_BCR_BVR_WCR_WVR(2),
> +	/* DBGDTR[RT]Xint */
> +	{ Op1( 0), CRn( 0), CRm( 3), Op2( 0), trap_raz_wi },
> +	/* DBGDTR[RT]Xext */
> +	{ Op1( 0), CRn( 0), CRm( 3), Op2( 2), trap_raz_wi },
> +	DBG_BCR_BVR_WCR_WVR(3),
> +	DBG_BCR_BVR_WCR_WVR(4),
> +	DBG_BCR_BVR_WCR_WVR(5),
> +	/* DBGWFAR */
> +	{ Op1( 0), CRn( 0), CRm( 6), Op2( 0), trap_raz_wi },
> +	/* DBGOSECCR */
> +	{ Op1( 0), CRn( 0), CRm( 6), Op2( 2), trap_raz_wi },
> +	DBG_BCR_BVR_WCR_WVR(6),
> +	/* DBGVCR */
> +	{ Op1( 0), CRn( 0), CRm( 7), Op2( 0), trap_debug32 },
> +	DBG_BCR_BVR_WCR_WVR(7),
> +	DBG_BCR_BVR_WCR_WVR(8),
> +	DBG_BCR_BVR_WCR_WVR(9),
> +	DBG_BCR_BVR_WCR_WVR(10),
> +	DBG_BCR_BVR_WCR_WVR(11),
> +	DBG_BCR_BVR_WCR_WVR(12),
> +	DBG_BCR_BVR_WCR_WVR(13),
> +	DBG_BCR_BVR_WCR_WVR(14),
> +	DBG_BCR_BVR_WCR_WVR(15),
> +
> +	/* DBGDRAR (32bit) */
> +	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), trap_raz_wi },
> +
> +	DBGBXVR(0),
> +	/* DBGOSLAR */
> +	{ Op1( 0), CRn( 1), CRm( 0), Op2( 4), trap_raz_wi },
> +	DBGBXVR(1),
> +	/* DBGOSLSR */
> +	{ Op1( 0), CRn( 1), CRm( 1), Op2( 4), trap_oslsr_el1 },
> +	DBGBXVR(2),
> +	DBGBXVR(3),
> +	/* DBGOSDLR */
> +	{ Op1( 0), CRn( 1), CRm( 3), Op2( 4), trap_raz_wi },
> +	DBGBXVR(4),
> +	/* DBGPRCR */
> +	{ Op1( 0), CRn( 1), CRm( 4), Op2( 4), trap_raz_wi },
> +	DBGBXVR(5),
> +	DBGBXVR(6),
> +	DBGBXVR(7),
> +	DBGBXVR(8),
> +	DBGBXVR(9),
> +	DBGBXVR(10),
> +	DBGBXVR(11),
> +	DBGBXVR(12),
> +	DBGBXVR(13),
> +	DBGBXVR(14),
> +	DBGBXVR(15),
> +
> +	/* DBGDSAR (32bit) */
> +	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), trap_raz_wi },
> +
> +	/* DBGDEVID2 */
> +	{ Op1( 0), CRn( 7), CRm( 0), Op2( 7), trap_raz_wi },
> +	/* DBGDEVID1 */
> +	{ Op1( 0), CRn( 7), CRm( 1), Op2( 7), trap_raz_wi },
> +	/* DBGDEVID */
> +	{ Op1( 0), CRn( 7), CRm( 2), Op2( 7), trap_raz_wi },
> +	/* DBGCLAIMSET */
> +	{ Op1( 0), CRn( 7), CRm( 8), Op2( 6), trap_raz_wi },
> +	/* DBGCLAIMCLR */
> +	{ Op1( 0), CRn( 7), CRm( 9), Op2( 6), trap_raz_wi },
> +	/* DBGAUTHSTATUS */
> +	{ Op1( 0), CRn( 7), CRm(14), Op2( 6), trap_dbgauthstatus_el1 },
>  };
>  
>  /* Trapped cp14 64bit registers */
>  static const struct sys_reg_desc cp14_64_regs[] = {
> +	/* DBGDRAR (64bit) */
> +	{ Op1( 0), CRm( 1), .access = trap_raz_wi },
> +
> +	/* DBGDSAR (64bit) */
> +	{ Op1( 0), CRm( 2), .access = trap_raz_wi },
>  };
>  
>  /*
> @@ -536,7 +677,6 @@ static const struct sys_reg_desc cp15_regs[] = {
>  	{ Op1( 0), CRn(10), CRm( 3), Op2( 0), access_vm_reg, NULL, c10_AMAIR0 },
>  	{ Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 },
>  	{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
> -
>  };
>  
>  static const struct sys_reg_desc cp15_64_regs[] = {
> -- 
> 1.8.3.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-06-20 13:00 ` [PATCH v3 8/9] arm64: KVM: implement lazy world switch for " Marc Zyngier
@ 2014-07-09  9:45   ` Christoffer Dall
  2014-07-09 11:18     ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2014-07-09  9:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 20, 2014 at 02:00:06PM +0100, Marc Zyngier wrote:
> Implement switching of the debug registers. While the number
> of registers is massive, CPUs usually don't implement them all
> (A57 has 6 breakpoints and 4 watchpoints, which gives us a total
> of 22 registers "only").
> 
> Also, we only save/restore them when MDSCR_EL1 has debug enabled,
> or when we've flagged the debug registers as dirty. It means that
> most of the time, we only save/restore MDSCR_EL1.
> 
> Reviewed-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kernel/asm-offsets.c |   1 +
>  arch/arm64/kvm/hyp.S            | 462 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 457 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 646f888..ae73a83 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -120,6 +120,7 @@ int main(void)
>    DEFINE(VCPU_ESR_EL2,		offsetof(struct kvm_vcpu, arch.fault.esr_el2));
>    DEFINE(VCPU_FAR_EL2,		offsetof(struct kvm_vcpu, arch.fault.far_el2));
>    DEFINE(VCPU_HPFAR_EL2,	offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
> +  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
>    DEFINE(VCPU_HCR_EL2,		offsetof(struct kvm_vcpu, arch.hcr_el2));
>    DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
>    DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index b0d1512..727087c 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -21,6 +21,7 @@
>  #include <asm/assembler.h>
>  #include <asm/memory.h>
>  #include <asm/asm-offsets.h>
> +#include <asm/debug-monitors.h>
>  #include <asm/fpsimdmacros.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
> @@ -215,6 +216,7 @@ __kvm_hyp_code_start:
>  	mrs	x22, 	amair_el1
>  	mrs	x23, 	cntkctl_el1
>  	mrs	x24,	par_el1
> +	mrs	x25,	mdscr_el1
>  
>  	stp	x4, x5, [x3]
>  	stp	x6, x7, [x3, #16]
> @@ -226,7 +228,202 @@ __kvm_hyp_code_start:
>  	stp	x18, x19, [x3, #112]
>  	stp	x20, x21, [x3, #128]
>  	stp	x22, x23, [x3, #144]
> -	str	x24, [x3, #160]
> +	stp	x24, x25, [x3, #160]
> +.endm
> +
> +.macro save_debug
> +	// x2: base address for cpu context
> +	// x3: tmp register
> +
> +	mrs	x26, id_aa64dfr0_el1
> +	ubfx	x24, x26, #12, #4	// Extract BRPs
> +	ubfx	x25, x26, #20, #4	// Extract WRPs
> +	mov	w26, #15
> +	sub	w24, w26, w24		// How many BPs to skip
> +	sub	w25, w26, w25		// How many WPs to skip
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	mrs	x20, dbgbcr15_el1
> +	mrs	x19, dbgbcr14_el1
> +	mrs	x18, dbgbcr13_el1
> +	mrs	x17, dbgbcr12_el1
> +	mrs	x16, dbgbcr11_el1
> +	mrs	x15, dbgbcr10_el1
> +	mrs	x14, dbgbcr9_el1
> +	mrs	x13, dbgbcr8_el1
> +	mrs	x12, dbgbcr7_el1
> +	mrs	x11, dbgbcr6_el1
> +	mrs	x10, dbgbcr5_el1
> +	mrs	x9, dbgbcr4_el1
> +	mrs	x8, dbgbcr3_el1
> +	mrs	x7, dbgbcr2_el1
> +	mrs	x6, dbgbcr1_el1
> +	mrs	x5, dbgbcr0_el1
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +
> +1:
> +	str	x20, [x3, #(15 * 8)]
> +	str	x19, [x3, #(14 * 8)]
> +	str	x18, [x3, #(13 * 8)]
> +	str	x17, [x3, #(12 * 8)]
> +	str	x16, [x3, #(11 * 8)]
> +	str	x15, [x3, #(10 * 8)]
> +	str	x14, [x3, #(9 * 8)]
> +	str	x13, [x3, #(8 * 8)]
> +	str	x12, [x3, #(7 * 8)]
> +	str	x11, [x3, #(6 * 8)]
> +	str	x10, [x3, #(5 * 8)]
> +	str	x9, [x3, #(4 * 8)]
> +	str	x8, [x3, #(3 * 8)]
> +	str	x7, [x3, #(2 * 8)]
> +	str	x6, [x3, #(1 * 8)]
> +	str	x5, [x3, #(0 * 8)]
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	mrs	x20, dbgbvr15_el1
> +	mrs	x19, dbgbvr14_el1
> +	mrs	x18, dbgbvr13_el1
> +	mrs	x17, dbgbvr12_el1
> +	mrs	x16, dbgbvr11_el1
> +	mrs	x15, dbgbvr10_el1
> +	mrs	x14, dbgbvr9_el1
> +	mrs	x13, dbgbvr8_el1
> +	mrs	x12, dbgbvr7_el1
> +	mrs	x11, dbgbvr6_el1
> +	mrs	x10, dbgbvr5_el1
> +	mrs	x9, dbgbvr4_el1
> +	mrs	x8, dbgbvr3_el1
> +	mrs	x7, dbgbvr2_el1
> +	mrs	x6, dbgbvr1_el1
> +	mrs	x5, dbgbvr0_el1
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +
> +1:
> +	str	x20, [x3, #(15 * 8)]
> +	str	x19, [x3, #(14 * 8)]
> +	str	x18, [x3, #(13 * 8)]
> +	str	x17, [x3, #(12 * 8)]
> +	str	x16, [x3, #(11 * 8)]
> +	str	x15, [x3, #(10 * 8)]
> +	str	x14, [x3, #(9 * 8)]
> +	str	x13, [x3, #(8 * 8)]
> +	str	x12, [x3, #(7 * 8)]
> +	str	x11, [x3, #(6 * 8)]
> +	str	x10, [x3, #(5 * 8)]
> +	str	x9, [x3, #(4 * 8)]
> +	str	x8, [x3, #(3 * 8)]
> +	str	x7, [x3, #(2 * 8)]
> +	str	x6, [x3, #(1 * 8)]
> +	str	x5, [x3, #(0 * 8)]
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	mrs	x20, dbgwcr15_el1
> +	mrs	x19, dbgwcr14_el1
> +	mrs	x18, dbgwcr13_el1
> +	mrs	x17, dbgwcr12_el1
> +	mrs	x16, dbgwcr11_el1
> +	mrs	x15, dbgwcr10_el1
> +	mrs	x14, dbgwcr9_el1
> +	mrs	x13, dbgwcr8_el1
> +	mrs	x12, dbgwcr7_el1
> +	mrs	x11, dbgwcr6_el1
> +	mrs	x10, dbgwcr5_el1
> +	mrs	x9, dbgwcr4_el1
> +	mrs	x8, dbgwcr3_el1
> +	mrs	x7, dbgwcr2_el1
> +	mrs	x6, dbgwcr1_el1
> +	mrs	x5, dbgwcr0_el1
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +
> +1:
> +	str	x20, [x3, #(15 * 8)]
> +	str	x19, [x3, #(14 * 8)]
> +	str	x18, [x3, #(13 * 8)]
> +	str	x17, [x3, #(12 * 8)]
> +	str	x16, [x3, #(11 * 8)]
> +	str	x15, [x3, #(10 * 8)]
> +	str	x14, [x3, #(9 * 8)]
> +	str	x13, [x3, #(8 * 8)]
> +	str	x12, [x3, #(7 * 8)]
> +	str	x11, [x3, #(6 * 8)]
> +	str	x10, [x3, #(5 * 8)]
> +	str	x9, [x3, #(4 * 8)]
> +	str	x8, [x3, #(3 * 8)]
> +	str	x7, [x3, #(2 * 8)]
> +	str	x6, [x3, #(1 * 8)]
> +	str	x5, [x3, #(0 * 8)]
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	mrs	x20, dbgwvr15_el1
> +	mrs	x19, dbgwvr14_el1
> +	mrs	x18, dbgwvr13_el1
> +	mrs	x17, dbgwvr12_el1
> +	mrs	x16, dbgwvr11_el1
> +	mrs	x15, dbgwvr10_el1
> +	mrs	x14, dbgwvr9_el1
> +	mrs	x13, dbgwvr8_el1
> +	mrs	x12, dbgwvr7_el1
> +	mrs	x11, dbgwvr6_el1
> +	mrs	x10, dbgwvr5_el1
> +	mrs	x9, dbgwvr4_el1
> +	mrs	x8, dbgwvr3_el1
> +	mrs	x7, dbgwvr2_el1
> +	mrs	x6, dbgwvr1_el1
> +	mrs	x5, dbgwvr0_el1
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +
> +1:
> +	str	x20, [x3, #(15 * 8)]
> +	str	x19, [x3, #(14 * 8)]
> +	str	x18, [x3, #(13 * 8)]
> +	str	x17, [x3, #(12 * 8)]
> +	str	x16, [x3, #(11 * 8)]
> +	str	x15, [x3, #(10 * 8)]
> +	str	x14, [x3, #(9 * 8)]
> +	str	x13, [x3, #(8 * 8)]
> +	str	x12, [x3, #(7 * 8)]
> +	str	x11, [x3, #(6 * 8)]
> +	str	x10, [x3, #(5 * 8)]
> +	str	x9, [x3, #(4 * 8)]
> +	str	x8, [x3, #(3 * 8)]
> +	str	x7, [x3, #(2 * 8)]
> +	str	x6, [x3, #(1 * 8)]
> +	str	x5, [x3, #(0 * 8)]
> +
> +	mrs	x21, mdccint_el1
> +	str	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
>  .endm
>  
>  .macro restore_sysregs
> @@ -245,7 +442,7 @@ __kvm_hyp_code_start:
>  	ldp	x18, x19, [x3, #112]
>  	ldp	x20, x21, [x3, #128]
>  	ldp	x22, x23, [x3, #144]
> -	ldr	x24, [x3, #160]
> +	ldp	x24, x25, [x3, #160]
>  
>  	msr	vmpidr_el2,	x4
>  	msr	csselr_el1,	x5
> @@ -268,6 +465,198 @@ __kvm_hyp_code_start:
>  	msr	amair_el1,	x22
>  	msr	cntkctl_el1,	x23
>  	msr	par_el1,	x24
> +	msr	mdscr_el1,	x25
> +.endm
> +
> +.macro restore_debug
> +	// x2: base address for cpu context
> +	// x3: tmp register
> +
> +	mrs	x26, id_aa64dfr0_el1
> +	ubfx	x24, x26, #12, #4	// Extract BRPs
> +	ubfx	x25, x26, #20, #4	// Extract WRPs
> +	mov	w26, #15
> +	sub	w24, w26, w24		// How many BPs to skip
> +	sub	w25, w26, w25		// How many WPs to skip
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	ldr	x20, [x3, #(15 * 8)]
> +	ldr	x19, [x3, #(14 * 8)]
> +	ldr	x18, [x3, #(13 * 8)]
> +	ldr	x17, [x3, #(12 * 8)]
> +	ldr	x16, [x3, #(11 * 8)]
> +	ldr	x15, [x3, #(10 * 8)]
> +	ldr	x14, [x3, #(9 * 8)]
> +	ldr	x13, [x3, #(8 * 8)]
> +	ldr	x12, [x3, #(7 * 8)]
> +	ldr	x11, [x3, #(6 * 8)]
> +	ldr	x10, [x3, #(5 * 8)]
> +	ldr	x9, [x3, #(4 * 8)]
> +	ldr	x8, [x3, #(3 * 8)]
> +	ldr	x7, [x3, #(2 * 8)]
> +	ldr	x6, [x3, #(1 * 8)]
> +	ldr	x5, [x3, #(0 * 8)]
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	msr	dbgbcr15_el1, x20
> +	msr	dbgbcr14_el1, x19
> +	msr	dbgbcr13_el1, x18
> +	msr	dbgbcr12_el1, x17
> +	msr	dbgbcr11_el1, x16
> +	msr	dbgbcr10_el1, x15
> +	msr	dbgbcr9_el1, x14
> +	msr	dbgbcr8_el1, x13
> +	msr	dbgbcr7_el1, x12
> +	msr	dbgbcr6_el1, x11
> +	msr	dbgbcr5_el1, x10
> +	msr	dbgbcr4_el1, x9
> +	msr	dbgbcr3_el1, x8
> +	msr	dbgbcr2_el1, x7
> +	msr	dbgbcr1_el1, x6
> +	msr	dbgbcr0_el1, x5
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	ldr	x20, [x3, #(15 * 8)]
> +	ldr	x19, [x3, #(14 * 8)]
> +	ldr	x18, [x3, #(13 * 8)]
> +	ldr	x17, [x3, #(12 * 8)]
> +	ldr	x16, [x3, #(11 * 8)]
> +	ldr	x15, [x3, #(10 * 8)]
> +	ldr	x14, [x3, #(9 * 8)]
> +	ldr	x13, [x3, #(8 * 8)]
> +	ldr	x12, [x3, #(7 * 8)]
> +	ldr	x11, [x3, #(6 * 8)]
> +	ldr	x10, [x3, #(5 * 8)]
> +	ldr	x9, [x3, #(4 * 8)]
> +	ldr	x8, [x3, #(3 * 8)]
> +	ldr	x7, [x3, #(2 * 8)]
> +	ldr	x6, [x3, #(1 * 8)]
> +	ldr	x5, [x3, #(0 * 8)]
> +
> +	adr	x26, 1f
> +	add	x26, x26, x24, lsl #2
> +	br	x26
> +1:
> +	msr	dbgbvr15_el1, x20
> +	msr	dbgbvr14_el1, x19
> +	msr	dbgbvr13_el1, x18
> +	msr	dbgbvr12_el1, x17
> +	msr	dbgbvr11_el1, x16
> +	msr	dbgbvr10_el1, x15
> +	msr	dbgbvr9_el1, x14
> +	msr	dbgbvr8_el1, x13
> +	msr	dbgbvr7_el1, x12
> +	msr	dbgbvr6_el1, x11
> +	msr	dbgbvr5_el1, x10
> +	msr	dbgbvr4_el1, x9
> +	msr	dbgbvr3_el1, x8
> +	msr	dbgbvr2_el1, x7
> +	msr	dbgbvr1_el1, x6
> +	msr	dbgbvr0_el1, x5
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	ldr	x20, [x3, #(15 * 8)]
> +	ldr	x19, [x3, #(14 * 8)]
> +	ldr	x18, [x3, #(13 * 8)]
> +	ldr	x17, [x3, #(12 * 8)]
> +	ldr	x16, [x3, #(11 * 8)]
> +	ldr	x15, [x3, #(10 * 8)]
> +	ldr	x14, [x3, #(9 * 8)]
> +	ldr	x13, [x3, #(8 * 8)]
> +	ldr	x12, [x3, #(7 * 8)]
> +	ldr	x11, [x3, #(6 * 8)]
> +	ldr	x10, [x3, #(5 * 8)]
> +	ldr	x9, [x3, #(4 * 8)]
> +	ldr	x8, [x3, #(3 * 8)]
> +	ldr	x7, [x3, #(2 * 8)]
> +	ldr	x6, [x3, #(1 * 8)]
> +	ldr	x5, [x3, #(0 * 8)]
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	msr	dbgwcr15_el1, x20
> +	msr	dbgwcr14_el1, x19
> +	msr	dbgwcr13_el1, x18
> +	msr	dbgwcr12_el1, x17
> +	msr	dbgwcr11_el1, x16
> +	msr	dbgwcr10_el1, x15
> +	msr	dbgwcr9_el1, x14
> +	msr	dbgwcr8_el1, x13
> +	msr	dbgwcr7_el1, x12
> +	msr	dbgwcr6_el1, x11
> +	msr	dbgwcr5_el1, x10
> +	msr	dbgwcr4_el1, x9
> +	msr	dbgwcr3_el1, x8
> +	msr	dbgwcr2_el1, x7
> +	msr	dbgwcr1_el1, x6
> +	msr	dbgwcr0_el1, x5
> +
> +	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	ldr	x20, [x3, #(15 * 8)]
> +	ldr	x19, [x3, #(14 * 8)]
> +	ldr	x18, [x3, #(13 * 8)]
> +	ldr	x17, [x3, #(12 * 8)]
> +	ldr	x16, [x3, #(11 * 8)]
> +	ldr	x15, [x3, #(10 * 8)]
> +	ldr	x14, [x3, #(9 * 8)]
> +	ldr	x13, [x3, #(8 * 8)]
> +	ldr	x12, [x3, #(7 * 8)]
> +	ldr	x11, [x3, #(6 * 8)]
> +	ldr	x10, [x3, #(5 * 8)]
> +	ldr	x9, [x3, #(4 * 8)]
> +	ldr	x8, [x3, #(3 * 8)]
> +	ldr	x7, [x3, #(2 * 8)]
> +	ldr	x6, [x3, #(1 * 8)]
> +	ldr	x5, [x3, #(0 * 8)]
> +
> +	adr	x26, 1f
> +	add	x26, x26, x25, lsl #2
> +	br	x26
> +1:
> +	msr	dbgwvr15_el1, x20
> +	msr	dbgwvr14_el1, x19
> +	msr	dbgwvr13_el1, x18
> +	msr	dbgwvr12_el1, x17
> +	msr	dbgwvr11_el1, x16
> +	msr	dbgwvr10_el1, x15
> +	msr	dbgwvr9_el1, x14
> +	msr	dbgwvr8_el1, x13
> +	msr	dbgwvr7_el1, x12
> +	msr	dbgwvr6_el1, x11
> +	msr	dbgwvr5_el1, x10
> +	msr	dbgwvr4_el1, x9
> +	msr	dbgwvr3_el1, x8
> +	msr	dbgwvr2_el1, x7
> +	msr	dbgwvr1_el1, x6
> +	msr	dbgwvr0_el1, x5
> +
> +	ldr	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
> +	msr	mdccint_el1, x21
>  .endm
>  
>  .macro skip_32bit_state tmp, target
> @@ -282,6 +671,35 @@ __kvm_hyp_code_start:
>  	tbz	\tmp, #12, \target
>  .endm
>  
> +.macro skip_debug_state tmp, target
> +	ldr	\tmp, [x0, #VCPU_DEBUG_FLAGS]
> +	tbz	\tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
> +.endm
> +
> +.macro compute_debug_state target
> +	// Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
> +	// is set, we do a full save/restore cycle and disable trapping.
> +	add	x25, x0, #VCPU_CONTEXT
> +
> +	// Check the state of MDSCR_EL1
> +	ldr	x25, [x25, #CPU_SYSREG_OFFSET(MDSCR_EL1)]
> +	and	x26, x25, #DBG_MDSCR_KDE
> +	and	x25, x25, #DBG_MDSCR_MDE
> +	adds	xzr, x25, x26
> +	b.eq	9998f		// Nothing to see there
> +
> +	// If any interesting bits were set, we must set the flag
> +	mov	x26, #KVM_ARM64_DEBUG_DIRTY
> +	str	x26, [x0, #VCPU_DEBUG_FLAGS]
> +	b	9999f		// Don't skip restore
> +
> +9998:
> +	// Otherwise load the flags from memory in case we recently
> +	// trapped
> +	skip_debug_state x25, \target
> +9999:
> +.endm
> +
>  .macro save_guest_32bit_state
>  	skip_32bit_state x3, 1f
>  
> @@ -297,10 +715,13 @@ __kvm_hyp_code_start:
>  	mrs	x4, dacr32_el2
>  	mrs	x5, ifsr32_el2
>  	mrs	x6, fpexc32_el2
> -	mrs	x7, dbgvcr32_el2
>  	stp	x4, x5, [x3]
> -	stp	x6, x7, [x3, #16]
> +	str	x6, [x3, #16]
>  
> +	skip_debug_state x8, 2f
> +	mrs	x7, dbgvcr32_el2
> +	str	x7, [x3, #24]
> +2:
>  	skip_tee_state x8, 1f
>  
>  	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
> @@ -323,12 +744,15 @@ __kvm_hyp_code_start:
>  
>  	add	x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>  	ldp	x4, x5, [x3]
> -	ldp	x6, x7, [x3, #16]
> +	ldr	x6, [x3, #16]
>  	msr	dacr32_el2, x4
>  	msr	ifsr32_el2, x5
>  	msr	fpexc32_el2, x6
> -	msr	dbgvcr32_el2, x7
>  
> +	skip_debug_state x8, 2f
> +	ldr	x7, [x3, #24]
> +	msr	dbgvcr32_el2, x7
> +2:
>  	skip_tee_state x8, 1f
>  
>  	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
> @@ -537,6 +961,14 @@ __restore_sysregs:
>  	restore_sysregs
>  	ret
>  
> +__save_debug:
> +	save_debug
> +	ret
> +
> +__restore_debug:
> +	restore_debug
> +	ret
> +
>  __save_fpsimd:
>  	save_fpsimd
>  	ret
> @@ -568,6 +1000,9 @@ ENTRY(__kvm_vcpu_run)
>  	bl __save_fpsimd
>  	bl __save_sysregs
>  
> +	compute_debug_state 1f
> +	bl	__save_debug
> +1:
>  	activate_traps
>  	activate_vm
>  
> @@ -579,6 +1014,10 @@ ENTRY(__kvm_vcpu_run)
>  
>  	bl __restore_sysregs
>  	bl __restore_fpsimd
> +
> +	skip_debug_state x3, 1f
> +	bl	__restore_debug
> +1:
>  	restore_guest_32bit_state
>  	restore_guest_regs
>  
> @@ -595,6 +1034,10 @@ __kvm_vcpu_return:
>  	save_guest_regs
>  	bl __save_fpsimd
>  	bl __save_sysregs
> +
> +	skip_debug_state x3, 1f
> +	bl	__save_debug
> +1:
>  	save_guest_32bit_state
>  
>  	save_timer_state
> @@ -609,6 +1052,13 @@ __kvm_vcpu_return:
>  
>  	bl __restore_sysregs
>  	bl __restore_fpsimd
> +
> +	skip_debug_state x3, 1f
> +	// Clear the dirty flag for the next run, as all the state has
> +	// already been saved.
> +	str	xzr, [x0, #VCPU_DEBUG_FLAGS]
> +	bl	__restore_debug
> +1:
>  	restore_host_regs
>  
>  	mov	x0, x1
> -- 
> 1.8.3.4
> 

let's just try to remember the fact that we overwrite the entire bitmask
here if we add more bits to that value some time:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-07-09  9:38   ` Christoffer Dall
@ 2014-07-09 11:09     ` Marc Zyngier
  2014-07-09 14:52       ` Christoffer Dall
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Zyngier @ 2014-07-09 11:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 09 2014 at 10:38:13 am BST, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Fri, Jun 20, 2014 at 02:00:01PM +0100, Marc Zyngier wrote:
>> Add handlers for all the AArch64 debug registers that are accessible
>> from EL0 or EL1. The trapping code keeps track of the state of the
>> debug registers, allowing for the switch code to implement a lazy
>> switching strategy.
>>
>> Reviewed-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
>>  arch/arm64/include/asm/kvm_host.h |   3 +
>>  arch/arm64/kvm/sys_regs.c         | 137 +++++++++++++++++++++++++++++++++++++-
>>  3 files changed, 159 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 9fcd54b..e6b159a 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -43,14 +43,25 @@
>>  #define      AMAIR_EL1       19      /* Aux Memory Attribute Indirection Register */
>>  #define      CNTKCTL_EL1     20      /* Timer Control Register (EL1) */
>>  #define      PAR_EL1         21      /* Physical Address Register */
>> +#define MDSCR_EL1    22      /* Monitor Debug System Control Register */
>> +#define DBGBCR0_EL1  23      /* Debug Breakpoint Control Registers (0-15) */
>> +#define DBGBCR15_EL1 38
>> +#define DBGBVR0_EL1  39      /* Debug Breakpoint Value Registers (0-15) */
>> +#define DBGBVR15_EL1 54
>> +#define DBGWCR0_EL1  55      /* Debug Watchpoint Control Registers (0-15) */
>> +#define DBGWCR15_EL1 70
>> +#define DBGWVR0_EL1  71      /* Debug Watchpoint Value Registers (0-15) */
>> +#define DBGWVR15_EL1 86
>> +#define MDCCINT_EL1  87      /* Monitor Debug Comms Channel Interrupt Enable Reg */
>> +
>>  /* 32bit specific registers. Keep them at the end of the range */
>> -#define      DACR32_EL2      22      /* Domain Access Control Register */
>> -#define      IFSR32_EL2      23      /* Instruction Fault Status Register */
>> -#define      FPEXC32_EL2     24      /* Floating-Point Exception Control Register */
>> -#define      DBGVCR32_EL2    25      /* Debug Vector Catch Register */
>> -#define      TEECR32_EL1     26      /* ThumbEE Configuration Register */
>> -#define      TEEHBR32_EL1    27      /* ThumbEE Handler Base Register */
>> -#define      NR_SYS_REGS     28
>> +#define      DACR32_EL2      88      /* Domain Access Control Register */
>> +#define      IFSR32_EL2      89      /* Instruction Fault Status Register */
>> +#define      FPEXC32_EL2     90      /* Floating-Point Exception Control Register */
>> +#define      DBGVCR32_EL2    91      /* Debug Vector Catch Register */
>> +#define      TEECR32_EL1     92      /* ThumbEE Configuration Register */
>> +#define      TEEHBR32_EL1    93      /* ThumbEE Handler Base Register */
>> +#define      NR_SYS_REGS     94
>>
>>  /* 32bit mapping */
>>  #define c0_MPIDR     (MPIDR_EL1 * 2) /* MultiProcessor ID Register */
>> @@ -87,6 +98,9 @@
>>  #define ARM_EXCEPTION_IRQ      0
>>  #define ARM_EXCEPTION_TRAP     1
>>
>> +#define KVM_ARM64_DEBUG_DIRTY_SHIFT  0
>> +#define KVM_ARM64_DEBUG_DIRTY                (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
>> +
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 92242ce..79573c86 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -101,6 +101,9 @@ struct kvm_vcpu_arch {
>>       /* Exception Information */
>>       struct kvm_vcpu_fault_info fault;
>>
>> +     /* Debug state */
>> +     u64 debug_flags;
>> +
>>       /* Pointer to host CPU context */
>>       kvm_cpu_context_t *host_cpu_context;
>>
>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>> index 4abd84e..808e3b2 100644
>> --- a/arch/arm64/kvm/sys_regs.c
>> +++ b/arch/arm64/kvm/sys_regs.c
>> @@ -30,6 +30,7 @@
>>  #include <asm/kvm_mmu.h>
>>  #include <asm/cacheflush.h>
>>  #include <asm/cputype.h>
>> +#include <asm/debug-monitors.h>
>>  #include <trace/events/kvm.h>
>>
>>  #include "sys_regs.h"
>> @@ -173,6 +174,60 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
>>               return read_zero(vcpu, p);
>>  }
>>
>> +static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
>> +                        const struct sys_reg_params *p,
>> +                        const struct sys_reg_desc *r)
>> +{
>> +     if (p->is_write) {
>> +             return ignore_write(vcpu, p);
>> +     } else {
>> +             *vcpu_reg(vcpu, p->Rt) = (1 << 3);
>> +             return true;
>> +     }
>> +}
>> +
>> +static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
>> +                                const struct sys_reg_params *p,
>> +                                const struct sys_reg_desc *r)
>> +{
>> +     if (p->is_write) {
>> +             return ignore_write(vcpu, p);
>> +     } else {
>> +             u32 val;
>> +             asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
>> +             *vcpu_reg(vcpu, p->Rt) = val;
>> +             return true;
>> +     }
>> +}
>> +
>> +/*
>> + * We want to avoid world-switching all the DBG registers all the
>> + * time. For this, we use a DIRTY but, indicating the guest has
>
> a DIRTY but?  (at least there's only one t in there).

The whole debug architecture makes me feel very dirty.

>> + * modified the debug registers, and only restore the registers once,
>> + * disabling traps.
>
> I don't think I understand the "only restore the registers once" bit
> here.  I know I'm being incredibly stupid, but I forgot since the last
> review round how this actually works; when we return from the guest and
> the guest has somehow enabled certain DBG functionality, then we set the dirty
> flag, which means we should stop trapping and context switch all the
> registers on world-switches, but if we see when returning from the guest
> that the guest doesn't appear to be using the registers we enable
> trapping and stop world-switching, right?

Almost. We always decide on the trapping when entering the guest:
- If the dirty bit is set (because we're coming back from trapping),
  disable the traps, restore the registers
- If debug is actively in use (DBG_MDSCR_KDE or DBG_MDSCR_MDE set),
  disable the traps, restore the registers
- Otherwise, enable the traps

When exiting the guest: If the dirty bit is set, save the registers and
clear the dirty bit.

> Do we clearly define which state triggers the world-switching and why
> that's a good rationale? (sorry, the debug architecture is not my
> favorite part of the ARM ARM).

I think the above comment describes the state precisely. My rationale is:
- If we've touched any debug register, it is likely that we're going to
  touch more of them. It then makes sense to disable the traps and start
  doing the save/restore dance
- If debug is active (DBG_MDSCR_KDE or DBG_MDSCR_MDE set), it is then
  mandatory to save/restore the registers, as the guest depends on them.

Does this make the process clearer? If so, I can add it to the comment.

>> + *
>> + * The best thing to do would be to trap MDSCR_EL1 independently, test
>> + * if DBG_MDSCR_KDE or DBG_MDSCR_MDE is getting set, and only set the
>> + * DIRTY bit in that case.
>> + *
>> + * Unfortunately, "old" Linux kernels tend to hit MDSCR_EL1 like a
>> + * woodpecker on a tree, and it is better to disable trapping as soon
>> + * as possible in this case. Some day, make this a tuneable...
>> + */
>> +static bool trap_debug_regs(struct kvm_vcpu *vcpu,
>> +                         const struct sys_reg_params *p,
>> +                         const struct sys_reg_desc *r)
>> +{
>> +     if (p->is_write) {
>> +             vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
>> +             vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
>> +     } else {
>> +             *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
>> +     }
>> +
>> +     return true;
>> +}
>> +
>>  static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>>  {
>>       u64 amair;
>> @@ -189,6 +244,21 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>>       vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
>>  }
>>
>> +/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
>> +#define DBG_BCR_BVR_WCR_WVR_EL1(n)                                   \
>> +     /* DBGBVRn_EL1 */                                               \
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100),     \
>> +       trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 },         \
>> +     /* DBGBCRn_EL1 */                                               \
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101),     \
>> +       trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 },         \
>> +     /* DBGWVRn_EL1 */                                               \
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110),     \
>> +       trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 },         \
>> +     /* DBGWCRn_EL1 */                                               \
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111),     \
>> +       trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 }
>> +
>>  /*
>>   * Architected system registers.
>>   * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
>> @@ -201,8 +271,12 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>>   * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
>>   * all PM registers, which doesn't crash the guest kernel at least.
>>   *
>> - * Same goes for the whole debug infrastructure, which probably breaks
>> - * some guest functionnality. This should be fixed.
>> + * Debug handling: We do trap most, if not all debug related system
>> + * registers. The implementation is good enough to ensure that a guest
>> + * can use these with minimal performance degradation. The drawback is
>> + * that we don't implement any of the external debug, none of the
>> + * OSlock protocol. This should be revisited if we ever encounter a
>> + * more demanding guest...
>>   */
>>  static const struct sys_reg_desc sys_reg_descs[] = {
>>       /* DC ISW */
>> @@ -215,12 +289,71 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>>       { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
>>         access_dcsw },
>>
>> +     DBG_BCR_BVR_WCR_WVR_EL1(0),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(1),
>> +     /* MDCCINT_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000),
>> +       trap_debug_regs, reset_val, MDCCINT_EL1, 0 },
>> +     /* MDSCR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010),
>> +       trap_debug_regs, reset_val, MDSCR_EL1, 0 },
>> +     DBG_BCR_BVR_WCR_WVR_EL1(2),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(3),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(4),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(5),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(6),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(7),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(8),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(9),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(10),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(11),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(12),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(13),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(14),
>> +     DBG_BCR_BVR_WCR_WVR_EL1(15),
>> +
>> +     /* MDRAR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000),
>> +       trap_raz_wi },
>> +     /* OSLAR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100),
>> +       trap_raz_wi },
>> +     /* OSLSR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100),
>> +       trap_oslsr_el1 },
>> +     /* OSDLR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100),
>> +       trap_raz_wi },
>> +     /* DBGPRCR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100),
>> +       trap_raz_wi },
>> +     /* DBGCLAIMSET_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110),
>> +       trap_raz_wi },
>> +     /* DBGCLAIMCLR_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110),
>> +       trap_raz_wi },
>> +     /* DBGAUTHSTATUS_EL1 */
>> +     { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
>> +       trap_dbgauthstatus_el1 },
>> +
>>       /* TEECR32_EL1 */
>>       { Op0(0b10), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000),
>>         NULL, reset_val, TEECR32_EL1, 0 },
>>       /* TEEHBR32_EL1 */
>>       { Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b0000), Op2(0b000),
>>         NULL, reset_val, TEEHBR32_EL1, 0 },
>> +
>> +     /* MDCCSR_EL1 */
>> +     { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000),
>> +       trap_raz_wi },
>> +     /* DBGDTR_EL0 */
>> +     { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000),
>> +       trap_raz_wi },
>> +     /* DBGDTR[TR]X_EL0 */
>> +     { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000),
>> +       trap_raz_wi },
>> +
>>       /* DBGVCR32_EL2 */
>>       { Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000),
>>         NULL, reset_val, DBGVCR32_EL2, 0 },
>> --
>> 1.8.3.4
>>
>
> Besides the commenting stuff above:
>
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-07-09  9:45   ` Christoffer Dall
@ 2014-07-09 11:18     ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-07-09 11:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 09 2014 at 10:45:05 am BST, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Fri, Jun 20, 2014 at 02:00:06PM +0100, Marc Zyngier wrote:
>> Implement switching of the debug registers. While the number
>> of registers is massive, CPUs usually don't implement them all
>> (A57 has 6 breakpoints and 4 watchpoints, which gives us a total
>> of 22 registers "only").
>>
>> Also, we only save/restore them when MDSCR_EL1 has debug enabled,
>> or when we've flagged the debug registers as dirty. It means that
>> most of the time, we only save/restore MDSCR_EL1.
>>
>> Reviewed-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kernel/asm-offsets.c |   1 +
>>  arch/arm64/kvm/hyp.S            | 462 +++++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 457 insertions(+), 6 deletions(-)
>>

[...]

>> @@ -609,6 +1052,13 @@ __kvm_vcpu_return:
>>
>>       bl __restore_sysregs
>>       bl __restore_fpsimd
>> +
>> +     skip_debug_state x3, 1f
>> +     // Clear the dirty flag for the next run, as all the state has
>> +     // already been saved.
>> +     str     xzr, [x0, #VCPU_DEBUG_FLAGS]
>> +     bl      __restore_debug
>> +1:
>>       restore_host_regs
>>
>>       mov     x0, x1
>> --
>> 1.8.3.4
>>
>
> let's just try to remember the fact that we overwrite the entire bitmask
> here if we add more bits to that value some time:

I'll add a comment to that effect.

> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

Thanks!

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-07-09 11:09     ` Marc Zyngier
@ 2014-07-09 14:52       ` Christoffer Dall
  2014-07-09 16:20         ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2014-07-09 14:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 09, 2014 at 12:09:29PM +0100, Marc Zyngier wrote:
> On Wed, Jul 09 2014 at 10:38:13 am BST, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Fri, Jun 20, 2014 at 02:00:01PM +0100, Marc Zyngier wrote:
> >> Add handlers for all the AArch64 debug registers that are accessible
> >> from EL0 or EL1. The trapping code keeps track of the state of the
> >> debug registers, allowing for the switch code to implement a lazy
> >> switching strategy.
> >>
> >> Reviewed-by: Anup Patel <anup.patel@linaro.org>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
> >>  arch/arm64/include/asm/kvm_host.h |   3 +
> >>  arch/arm64/kvm/sys_regs.c         | 137 +++++++++++++++++++++++++++++++++++++-
> >>  3 files changed, 159 insertions(+), 9 deletions(-)
> >>
[...]
> >> +/*
> >> + * We want to avoid world-switching all the DBG registers all the
> >> + * time. For this, we use a DIRTY but, indicating the guest has
> >
> > a DIRTY but?  (at least there's only one t in there).
> 
> The whole debug architecture makes me feel very dirty.
> 
> >> + * modified the debug registers, and only restore the registers once,
> >> + * disabling traps.
> >
> > I don't think I understand the "only restore the registers once" bit
> > here.  I know I'm being incredibly stupid, but I forgot since the last
> > review round how this actually works; when we return from the guest and
> > the guest has somehow enabled certain DBG functionality, then we set the dirty
> > flag, which means we should stop trapping and context switch all the
> > registers on world-switches, but if we see when returning from the guest
> > that the guest doesn't appear to be using the registers we enable
> > trapping and stop world-switching, right?
> 
> Almost. We always decide on the trapping when entering the guest:
> - If the dirty bit is set (because we're coming back from trapping),
>   disable the traps, restore the registers
> - If debug is actively in use (DBG_MDSCR_KDE or DBG_MDSCR_MDE set),
>   disable the traps, restore the registers

this also sets the dirty bit then?

> - Otherwise, enable the traps
> 
> When exiting the guest: If the dirty bit is set, save the registers and
> clear the dirty bit.

because the host should always be able to freely use the debug registers
so we always have to restore the host registers on coming out of the VM,
right?

> 
> > Do we clearly define which state triggers the world-switching and why
> > that's a good rationale? (sorry, the debug architecture is not my
> > favorite part of the ARM ARM).
> 
> I think the above comment describes the state precisely. My rationale is:
> - If we've touched any debug register, it is likely that we're going to
>   touch more of them. It then makes sense to disable the traps and start
>   doing the save/restore dance
> - If debug is active (DBG_MDSCR_KDE or DBG_MDSCR_MDE set), it is then
>   mandatory to save/restore the registers, as the guest depends on them.
> 
> Does this make the process clearer? If so, I can add it to the comment.
> 

yes, the above comments help a lot.  thanks!!

[if I don't see your response because you already left for vacation, I'm
going to incorporate your comments in the patches to apply to
kvmarm/next].

[...]
> >
> > Besides the commenting stuff above:
> >
> > Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> 
> Thanks,
> 
> 	M.
> -- 
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-07-09 14:52       ` Christoffer Dall
@ 2014-07-09 16:20         ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2014-07-09 16:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 09 2014 at  3:52:32 pm BST, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Wed, Jul 09, 2014 at 12:09:29PM +0100, Marc Zyngier wrote:
>> On Wed, Jul 09 2014 at 10:38:13 am BST, Christoffer Dall
>> <christoffer.dall@linaro.org> wrote:
>> > On Fri, Jun 20, 2014 at 02:00:01PM +0100, Marc Zyngier wrote:
>> >> Add handlers for all the AArch64 debug registers that are accessible
>> >> from EL0 or EL1. The trapping code keeps track of the state of the
>> >> debug registers, allowing for the switch code to implement a lazy
>> >> switching strategy.
>> >>
>> >> Reviewed-by: Anup Patel <anup.patel@linaro.org>
>> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> >> ---
>> >>  arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
>> >>  arch/arm64/include/asm/kvm_host.h |   3 +
>> >>  arch/arm64/kvm/sys_regs.c         | 137 +++++++++++++++++++++++++++++++++++++-
>> >>  3 files changed, 159 insertions(+), 9 deletions(-)

[...]

>> >> +/*
>> >> + * We want to avoid world-switching all the DBG registers all the
>> >> + * time. For this, we use a DIRTY but, indicating the guest has
>> >
>> > a DIRTY but?  (at least there's only one t in there).
>>
>> The whole debug architecture makes me feel very dirty.
>>
>> >> + * modified the debug registers, and only restore the registers once,
>> >> + * disabling traps.
>> >
>> > I don't think I understand the "only restore the registers once" bit
>> > here.  I know I'm being incredibly stupid, but I forgot since the last
>> > review round how this actually works; when we return from the guest and
>> > the guest has somehow enabled certain DBG functionality, then we
>> > set the dirty
>> > flag, which means we should stop trapping and context switch all the
>> > registers on world-switches, but if we see when returning from the guest
>> > that the guest doesn't appear to be using the registers we enable
>> > trapping and stop world-switching, right?
>>
>> Almost. We always decide on the trapping when entering the guest:
>> - If the dirty bit is set (because we're coming back from trapping),
>>   disable the traps, restore the registers
>> - If debug is actively in use (DBG_MDSCR_KDE or DBG_MDSCR_MDE set),
>>   disable the traps, restore the registers
>
> this also sets the dirty bit then?

Indeed. I'll mention it.

>
>> - Otherwise, enable the traps
>>
>> When exiting the guest: If the dirty bit is set, save the registers and
>> clear the dirty bit.
>
> because the host should always be able to freely use the debug registers
> so we always have to restore the host registers on coming out of the VM,
> right?

Yes. The host may have its own debug state active, and we want to
preserve that.

>>
>> > Do we clearly define which state triggers the world-switching and why
>> > that's a good rationale? (sorry, the debug architecture is not my
>> > favorite part of the ARM ARM).
>>
>> I think the above comment describes the state precisely. My rationale is:
>> - If we've touched any debug register, it is likely that we're going to
>>   touch more of them. It then makes sense to disable the traps and start
>>   doing the save/restore dance
>> - If debug is active (DBG_MDSCR_KDE or DBG_MDSCR_MDE set), it is then
>>   mandatory to save/restore the registers, as the guest depends on them.
>>
>> Does this make the process clearer? If so, I can add it to the comment.
>>
>
> yes, the above comments help a lot.  thanks!!
>
> [if I don't see your response because you already left for vacation, I'm
> going to incorporate your comments in the patches to apply to
> kvmarm/next].

I'm not quite gone yet! ;-) Just enough time left to respin the branch
on top of what's already queued, push it out and post the series.

	M.
-- 
Without deviation from the norm, progress is not possible.


end of thread, other threads:[~2014-07-09 16:20 UTC | newest]

Thread overview: 19+ messages
-- links below jump to the message on this page --
2014-06-20 12:59 [PATCH v3 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
2014-06-20 12:59 ` [PATCH v3 1/9] arm64: KVM: rename pm_fake handler to trap_raz_wi Marc Zyngier
2014-07-09  9:27   ` Christoffer Dall
2014-07-09  9:36     ` Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 3/9] arm64: KVM: add trap handlers for AArch64 debug registers Marc Zyngier
2014-07-09  9:38   ` Christoffer Dall
2014-07-09 11:09     ` Marc Zyngier
2014-07-09 14:52       ` Christoffer Dall
2014-07-09 16:20         ` Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15 Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 6/9] arm64: KVM: check ordering of all system register tables Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 7/9] arm64: KVM: add trap handlers for AArch32 debug registers Marc Zyngier
2014-07-09  9:43   ` Christoffer Dall
2014-06-20 13:00 ` [PATCH v3 8/9] arm64: KVM: implement lazy world switch for " Marc Zyngier
2014-07-09  9:45   ` Christoffer Dall
2014-07-09 11:18     ` Marc Zyngier
2014-06-20 13:00 ` [PATCH v3 9/9] arm64: KVM: enable trapping of all " Marc Zyngier
