* [PATCH 0/9] arm64: KVM: debug infrastructure support
From: Marc Zyngier @ 2014-05-07 15:20 UTC
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Catalin Marinas, Will Deacon, Ian Campbell, Christoffer Dall

This patch series adds debug support, a key feature missing from the
KVM/arm64 port.

The main idea is to keep track of whether the debug registers are
"dirty" (changed by the guest) or not. If they are, we perform the
usual save/restore dance, for one run only. This means we only pay a
penalty if a guest is actually using the debug registers.
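
In code terms, the idea boils down to the sketch below (illustrative
only: the KVM_ARM64_DEBUG_DIRTY flag is introduced in patch 3, while
save_debug_state()/restore_debug_state() are hypothetical stand-ins
for the hyp.S world-switch code added later in the series):

	/* Trap side: any guest write to a debug register goes non-lazy */
	static bool trap_debug_reg_write(struct kvm_vcpu *vcpu, u64 val, int reg)
	{
		vcpu_sys_reg(vcpu, reg) = val;
		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
		return true;
	}

	/* World-switch side: only pay the save/restore cost when dirty */
	static void switch_debug_regs(struct kvm_cpu_context *host_ctxt,
				      struct kvm_vcpu *vcpu)
	{
		if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) {
			save_debug_state(host_ctxt);		/* hypothetical */
			restore_debug_state(&vcpu->arch.ctxt);	/* hypothetical */
		}
	}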

The huge number of registers is properly frightening, but CPUs
actually only implement a subset of them. Also, there are a number of
registers we don't bother emulating (things having to do with external
debug and the OS lock).

This has been tested on a Cortex-A57 platform, running both 32 and
64bit guests, on top of 3.15-rc4. This code also lives in my tree in
the kvm-arm64/debug-trap branch.

Marc Zyngier (9):
  arm64: KVM: rename pm_fake handler to trap_wi_raz
  arm64: move DBG_MDSCR_* to asm/debug-monitors.h
  arm64: KVM: add trap handlers for AArch64 debug registers
  arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
  arm64: KVM: use separate tables for AArch32 32 and 64bit traps
  arm64: KVM: check ordering of all system register tables
  arm64: KVM: add trap handlers for AArch32 debug registers
  arm64: KVM: implement lazy world switch for debug registers
  arm64: KVM: enable trapping of all debug registers

 arch/arm64/include/asm/debug-monitors.h |  19 +-
 arch/arm64/include/asm/kvm_asm.h        |  39 ++-
 arch/arm64/include/asm/kvm_coproc.h     |   3 +-
 arch/arm64/include/asm/kvm_host.h       |  12 +-
 arch/arm64/kernel/asm-offsets.c         |   1 +
 arch/arm64/kernel/debug-monitors.c      |   9 -
 arch/arm64/kvm/handle_exit.c            |   4 +-
 arch/arm64/kvm/hyp.S                    | 457 ++++++++++++++++++++++++++++-
 arch/arm64/kvm/sys_regs.c               | 494 +++++++++++++++++++++++++++-----
 9 files changed, 940 insertions(+), 98 deletions(-)

-- 
1.8.3.4

* [PATCH 1/9] arm64: KVM: rename pm_fake handler to trap_wi_raz
From: Marc Zyngier @ 2014-05-07 15:20 UTC
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).

As we're about to use it (a lot) in a different context, rename it
with an (admittedly cryptic) name that makes sense for all users.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 83 ++++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0324458..fc8d4e3 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -163,18 +163,9 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
 	return true;
 }
 
-/*
- * We could trap ID_DFR0 and tell the guest we don't support performance
- * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
- * NAKed, so it will read the PMCR anyway.
- *
- * Therefore we tell the guest we have 0 counters.  Unfortunately, we
- * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
- * all PM registers, which doesn't crash the guest kernel at least.
- */
-static bool pm_fake(struct kvm_vcpu *vcpu,
-		    const struct sys_reg_params *p,
-		    const struct sys_reg_desc *r)
+static bool trap_wi_raz(struct kvm_vcpu *vcpu,
+			const struct sys_reg_params *p,
+			const struct sys_reg_desc *r)
 {
 	if (p->is_write)
 		return ignore_write(vcpu, p);
@@ -201,6 +192,17 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
+ *
+ * We could trap ID_DFR0 and tell the guest we don't support performance
+ * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
+ * NAKed, so it will read the PMCR anyway.
+ *
+ * Therefore we tell the guest we have 0 counters.  Unfortunately, we
+ * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
+ * all PM registers, which doesn't crash the guest kernel at least.
+ *
+ * Same goes for the whole debug infrastructure, which probably breaks
+ * some guest functionality. This should be fixed.
  */
 static const struct sys_reg_desc sys_reg_descs[] = {
 	/* DC ISW */
@@ -260,10 +262,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	/* PMINTENSET_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMINTENCLR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b010),
-	  pm_fake },
+	  trap_wi_raz },
 
 	/* MAIR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0010), Op2(0b000),
@@ -292,43 +294,43 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	/* PMCR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b000),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMCNTENSET_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b001),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMCNTENCLR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b010),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMOVSCLR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b011),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMSWINC_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b100),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMSELR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b101),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMCEID0_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b110),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMCEID1_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b111),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMCCNTR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b000),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMXEVTYPER_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b001),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMXEVCNTR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b010),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMUSERENR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1110), Op2(0b000),
-	  pm_fake },
+	  trap_wi_raz },
 	/* PMOVSSET_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1110), Op2(0b011),
-	  pm_fake },
+	  trap_wi_raz },
 
 	/* TPIDR_EL0 */
 	{ Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b0000), Op2(0b010),
@@ -374,19 +376,20 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn( 7), CRm(10), Op2( 2), access_dcsw },
 	{ Op1( 0), CRn( 7), CRm(14), Op2( 2), access_dcsw },
 
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 2), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 3), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 5), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 6), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(12), Op2( 7), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(13), Op2( 2), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 0), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 1), pm_fake },
-	{ Op1( 0), CRn( 9), CRm(14), Op2( 2), pm_fake },
+	/* PMU */
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 0), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 1), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 2), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 3), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 5), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 6), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(12), Op2( 7), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 0), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 1), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(13), Op2( 2), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 0), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 1), trap_wi_raz },
+	{ Op1( 0), CRn( 9), CRm(14), Op2( 2), trap_wi_raz },
 
 	{ Op1( 0), CRn(10), CRm( 2), Op2( 0), access_vm_reg, NULL, c10_PRRR },
 	{ Op1( 0), CRn(10), CRm( 2), Op2( 1), access_vm_reg, NULL, c10_NMRR },
-- 
1.8.3.4


* [PATCH 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h
From: Marc Zyngier @ 2014-05-07 15:20 UTC
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

In order to be able to use the DBG_MDSCR_* macros from the KVM code,
move the relevant definitions to the obvious include file.

Also move the debug_el enum to the portion of the file that is guarded
by #ifndef __ASSEMBLY__, so that the file can be included from
assembly code.
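
As a hypothetical illustration of the kind of consumer this enables
(kvm_guest_mdscr() is not part of this series):

	#include <asm/debug-monitors.h>

	/* Sketch: compute an MDSCR_EL1 value with monitor debug enabled,
	 * using the relocated DBG_MDSCR_* definitions. */
	static u64 kvm_guest_mdscr(u64 mdscr)
	{
		mdscr &= DBG_MDSCR_MASK;	/* clear KDE and MDE */
		mdscr |= DBG_MDSCR_MDE;		/* enable monitor debug events */
		return mdscr;
	}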

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/debug-monitors.h | 19 ++++++++++++++-----
 arch/arm64/kernel/debug-monitors.c      |  9 ---------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 6e9b5b3..7fb3437 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -18,6 +18,15 @@
 
 #ifdef __KERNEL__
 
+/* Low-level stepping controls. */
+#define DBG_MDSCR_SS		(1 << 0)
+#define DBG_SPSR_SS		(1 << 21)
+
+/* MDSCR_EL1 enabling bits */
+#define DBG_MDSCR_KDE		(1 << 13)
+#define DBG_MDSCR_MDE		(1 << 15)
+#define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
+
 #define	DBG_ESR_EVT(x)		(((x) >> 27) & 0x7)
 
 /* AArch64 */
@@ -73,11 +82,6 @@
 
 #define CACHE_FLUSH_IS_SAFE		1
 
-enum debug_el {
-	DBG_ACTIVE_EL0 = 0,
-	DBG_ACTIVE_EL1,
-};
-
 /* AArch32 */
 #define DBG_ESR_EVT_BKPT	0x4
 #define DBG_ESR_EVT_VECC	0x5
@@ -115,6 +119,11 @@ void unregister_break_hook(struct break_hook *hook);
 
 u8 debug_monitors_arch(void);
 
+enum debug_el {
+	DBG_ACTIVE_EL0 = 0,
+	DBG_ACTIVE_EL1,
+};
+
 void enable_debug_monitors(enum debug_el el);
 void disable_debug_monitors(enum debug_el el);
 
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index a7fb874..e022f87 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -30,15 +30,6 @@
 #include <asm/cputype.h>
 #include <asm/system_misc.h>
 
-/* Low-level stepping controls. */
-#define DBG_MDSCR_SS		(1 << 0)
-#define DBG_SPSR_SS		(1 << 21)
-
-/* MDSCR_EL1 enabling bits */
-#define DBG_MDSCR_KDE		(1 << 13)
-#define DBG_MDSCR_MDE		(1 << 15)
-#define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
-
 /* Determine debug architecture. */
 u8 debug_monitors_arch(void)
 {
-- 
1.8.3.4


* [PATCH 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
From: Marc Zyngier @ 2014-05-07 15:20 UTC
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

Add handlers for all the AArch64 debug registers that are accessible
from EL0 or EL1. The trapping code keeps track of the state of the
debug registers, allowing the world-switch code to implement a lazy
switching strategy.
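
The comment in trap_debug_regs below describes a finer-grained
alternative that this patch deliberately avoids; as a sketch, it would
look roughly like this (illustrative only, not part of the diff):

	/* Only go non-lazy when the guest actually enables debug via
	 * MDSCR_EL1.KDE/MDE, rather than on any debug register write: */
	if (p->is_write && r->reg == MDSCR_EL1 &&
	    (*vcpu_reg(vcpu, p->Rt) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE)))
		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;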

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
 arch/arm64/include/asm/kvm_host.h |   3 +
 arch/arm64/kvm/sys_regs.c         | 130 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 9fcd54b..e6b159a 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -43,14 +43,25 @@
 #define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
 #define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
 #define	PAR_EL1		21	/* Physical Address Register */
+#define MDSCR_EL1	22	/* Monitor Debug System Control Register */
+#define DBGBCR0_EL1	23	/* Debug Breakpoint Control Registers (0-15) */
+#define DBGBCR15_EL1	38
+#define DBGBVR0_EL1	39	/* Debug Breakpoint Value Registers (0-15) */
+#define DBGBVR15_EL1	54
+#define DBGWCR0_EL1	55	/* Debug Watchpoint Control Registers (0-15) */
+#define DBGWCR15_EL1	70
+#define DBGWVR0_EL1	71	/* Debug Watchpoint Value Registers (0-15) */
+#define DBGWVR15_EL1	86
+#define MDCCINT_EL1	87	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+
 /* 32bit specific registers. Keep them at the end of the range */
-#define	DACR32_EL2	22	/* Domain Access Control Register */
-#define	IFSR32_EL2	23	/* Instruction Fault Status Register */
-#define	FPEXC32_EL2	24	/* Floating-Point Exception Control Register */
-#define	DBGVCR32_EL2	25	/* Debug Vector Catch Register */
-#define	TEECR32_EL1	26	/* ThumbEE Configuration Register */
-#define	TEEHBR32_EL1	27	/* ThumbEE Handler Base Register */
-#define	NR_SYS_REGS	28
+#define	DACR32_EL2	88	/* Domain Access Control Register */
+#define	IFSR32_EL2	89	/* Instruction Fault Status Register */
+#define	FPEXC32_EL2	90	/* Floating-Point Exception Control Register */
+#define	DBGVCR32_EL2	91	/* Debug Vector Catch Register */
+#define	TEECR32_EL1	92	/* ThumbEE Configuration Register */
+#define	TEEHBR32_EL1	93	/* ThumbEE Handler Base Register */
+#define	NR_SYS_REGS	94
 
 /* 32bit mapping */
 #define c0_MPIDR	(MPIDR_EL1 * 2)	/* MultiProcessor ID Register */
@@ -87,6 +98,9 @@
 #define ARM_EXCEPTION_IRQ	  0
 #define ARM_EXCEPTION_TRAP	  1
 
+#define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
+#define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0a1d697..4737961 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -101,6 +101,9 @@ struct kvm_vcpu_arch {
 	/* Exception Information */
 	struct kvm_vcpu_fault_info fault;
 
+	/* Debug state */
+	u64 debug_flags;
+
 	/* Pointer to host CPU context */
 	kvm_cpu_context_t *host_cpu_context;
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index fc8d4e3..618d4fb 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -30,6 +30,7 @@
 #include <asm/kvm_mmu.h>
 #include <asm/cacheflush.h>
 #include <asm/cputype.h>
+#include <asm/debug-monitors.h>
 #include <trace/events/kvm.h>
 
 #include "sys_regs.h"
@@ -173,6 +174,58 @@ static bool trap_wi_raz(struct kvm_vcpu *vcpu,
 		return read_zero(vcpu, p);
 }
 
+static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
+			   const struct sys_reg_params *p,
+			   const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = (1 << 3);
+		return true;
+	}
+}
+
+static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
+				   const struct sys_reg_params *p,
+				   const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = 0x2222; /* Implemented and disabled */
+		return true;
+	}
+}
+
+/*
+ * Trap handler for DBG[BW][CV]Rn_EL1 and MDSCR_EL1. We track the
+ * "dirtiness" of the registers.
+ */
+static bool trap_debug_regs(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_params *p,
+			    const struct sys_reg_desc *r)
+{
+	/*
+	 * The best thing to do would be to trap MDSCR_EL1
+	 * independently, test if DBG_MDSCR_KDE or DBG_MDSCR_MDE is
+	 * getting set, and only set the DIRTY bit in that case.
+	 *
+	 * Unfortunately, "old" Linux kernels tend to hit MDSCR_EL1
+	 * like a woodpecker on a tree, and it is better to disable
+	 * trapping as soon as possible in this case. Some day, make
+	 * this a tuneable...
+	 */
+	if (p->is_write) {
+		vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+	}
+
+	return true;
+}
+
 static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
 	u64 amair;
@@ -189,6 +242,21 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 	vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
 }
 
+/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
+#define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
+	/* DBGBVRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100),	\
+	  trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 },		\
+	/* DBGBCRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101),	\
+	  trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 },		\
+	/* DBGWVRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110),	\
+	  trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 },		\
+	/* DBGWCRn_EL1 */						\
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111),	\
+	  trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 }
+
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
@@ -200,9 +268,6 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
  * Therefore we tell the guest we have 0 counters.  Unfortunately, we
  * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
  * all PM registers, which doesn't crash the guest kernel at least.
- *
- * Same goes for the whole debug infrastructure, which probably breaks
- * some guest functionality. This should be fixed.
  */
 static const struct sys_reg_desc sys_reg_descs[] = {
 	/* DC ISW */
@@ -215,12 +280,71 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
 	  access_dcsw },
 
+	DBG_BCR_BVR_WCR_WVR_EL1(0),
+	DBG_BCR_BVR_WCR_WVR_EL1(1),
+	/* MDCCINT_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000),
+	  trap_debug_regs, reset_val, MDCCINT_EL1, 0 },
+	/* MDSCR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010),
+	  trap_debug_regs, reset_val, MDSCR_EL1, 0 },
+	DBG_BCR_BVR_WCR_WVR_EL1(2),
+	DBG_BCR_BVR_WCR_WVR_EL1(3),
+	DBG_BCR_BVR_WCR_WVR_EL1(4),
+	DBG_BCR_BVR_WCR_WVR_EL1(5),
+	DBG_BCR_BVR_WCR_WVR_EL1(6),
+	DBG_BCR_BVR_WCR_WVR_EL1(7),
+	DBG_BCR_BVR_WCR_WVR_EL1(8),
+	DBG_BCR_BVR_WCR_WVR_EL1(9),
+	DBG_BCR_BVR_WCR_WVR_EL1(10),
+	DBG_BCR_BVR_WCR_WVR_EL1(11),
+	DBG_BCR_BVR_WCR_WVR_EL1(12),
+	DBG_BCR_BVR_WCR_WVR_EL1(13),
+	DBG_BCR_BVR_WCR_WVR_EL1(14),
+	DBG_BCR_BVR_WCR_WVR_EL1(15),
+
+	/* MDRAR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000),
+	  trap_wi_raz },
+	/* OSLAR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100),
+	  trap_wi_raz },
+	/* OSLSR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100),
+	  trap_oslsr_el1 },
+	/* OSDLR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100),
+	  trap_wi_raz },
+	/* DBGPRCR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100),
+	  trap_wi_raz },
+	/* DBGCLAIMSET_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110),
+	  trap_wi_raz },
+	/* DBGCLAIMCLR_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110),
+	  trap_wi_raz },
+	/* DBGAUTHSTATUS_EL1 */
+	{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
+	  trap_dbgauthstatus_el1 },
+
 	/* TEECR32_EL1 */
 	{ Op0(0b10), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000),
 	  NULL, reset_val, TEECR32_EL1, 0 },
 	/* TEEHBR32_EL1 */
 	{ Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b0000), Op2(0b000),
 	  NULL, reset_val, TEEHBR32_EL1, 0 },
+
+	/* MDCCSR_EL1 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000),
+	  trap_wi_raz },
+	/* DBGDTR_EL0 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000),
+	  trap_wi_raz },
+	/* DBGDTR[TR]X_EL0 */
+	{ Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000),
+	  trap_wi_raz },
+
 	/* DBGVCR32_EL2 */
 	{ Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000),
 	  NULL, reset_val, DBGVCR32_EL2, 0 },
-- 
1.8.3.4


* [PATCH 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
From: Marc Zyngier @ 2014-05-07 15:20 UTC
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Catalin Marinas, Will Deacon, Ian Campbell, Christoffer Dall

As we're about to trap a bunch of CP14 registers, let's rework
the CP15 handling so it can be generalized and work with multiple
tables.
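
The resulting lookup pattern, as implemented in the diff below, is
simply "target-specific table first, then the global table":

	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
		return 1;	/* handled by the target-specific table */
	if (!emulate_cp(vcpu, &params, global, nr_global))
		return 1;	/* handled by the global table */
	unhandled_cp_access(vcpu, &params);
	return 1;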

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h    |   2 +-
 arch/arm64/include/asm/kvm_coproc.h |   3 +-
 arch/arm64/include/asm/kvm_host.h   |   9 ++-
 arch/arm64/kvm/handle_exit.c        |   4 +-
 arch/arm64/kvm/sys_regs.c           | 121 +++++++++++++++++++++++++++++-------
 5 files changed, 111 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index e6b159a..12f9dd7 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -93,7 +93,7 @@
 #define c10_AMAIR0	(AMAIR_EL1 * 2)	/* Aux Memory Attr Indirection Reg */
 #define c10_AMAIR1	(c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
 #define c14_CNTKCTL	(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
-#define NR_CP15_REGS	(NR_SYS_REGS * 2)
+#define NR_COPRO_REGS	(NR_SYS_REGS * 2)
 
 #define ARM_EXCEPTION_IRQ	  0
 #define ARM_EXCEPTION_TRAP	  1
diff --git a/arch/arm64/include/asm/kvm_coproc.h b/arch/arm64/include/asm/kvm_coproc.h
index 9a59301..0b52377 100644
--- a/arch/arm64/include/asm/kvm_coproc.h
+++ b/arch/arm64/include/asm/kvm_coproc.h
@@ -39,7 +39,8 @@ void kvm_register_target_sys_reg_table(unsigned int target,
 				       struct kvm_sys_reg_target_table *table);
 
 int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4737961..31cff7a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -86,7 +86,7 @@ struct kvm_cpu_context {
 	struct kvm_regs	gp_regs;
 	union {
 		u64 sys_regs[NR_SYS_REGS];
-		u32 cp15[NR_CP15_REGS];
+		u32 copro[NR_COPRO_REGS];
 	};
 };
 
@@ -141,7 +141,12 @@ struct kvm_vcpu_arch {
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
 #define vcpu_sys_reg(v,r)	((v)->arch.ctxt.sys_regs[(r)])
-#define vcpu_cp15(v,r)		((v)->arch.ctxt.cp15[(r)])
+/*
+ * CP14 and CP15 live in the same array, as they are backed by the
+ * same system registers.
+ */
+#define vcpu_cp14(v,r)		((v)->arch.ctxt.copro[(r)])
+#define vcpu_cp15(v,r)		((v)->arch.ctxt.copro[(r)])
 
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7bc41ea..f0ca49f 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -69,9 +69,9 @@ static exit_handle_fn arm_exit_handlers[] = {
 	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
 	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
 	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
-	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
+	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_32,
 	[ESR_EL2_EC_CP14_LS]	= kvm_handle_cp14_load_store,
-	[ESR_EL2_EC_CP14_64]	= kvm_handle_cp14_access,
+	[ESR_EL2_EC_CP14_64]	= kvm_handle_cp14_64,
 	[ESR_EL2_EC_HVC32]	= handle_hvc,
 	[ESR_EL2_EC_SMC32]	= handle_smc,
 	[ESR_EL2_EC_HVC64]	= handle_hvc,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 618d4fb..feafd8d 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -474,6 +474,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	  NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
+/* Trapped cp14 registers */
+static const struct sys_reg_desc cp14_regs[] = {
+};
+
 /*
  * Trapped cp15 registers. TTBR0/TTBR1 get a double encoding,
  * depending on the way they are accessed (as a 32bit or a 64bit
@@ -581,26 +585,19 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return 1;
 }
 
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run)
-{
-	kvm_inject_undefined(vcpu);
-	return 1;
-}
-
-static void emulate_cp15(struct kvm_vcpu *vcpu,
-			 const struct sys_reg_params *params)
+static int emulate_cp(struct kvm_vcpu *vcpu,
+		      const struct sys_reg_params *params,
+		      const struct sys_reg_desc *table,
+		      size_t num)
 {
-	size_t num;
-	const struct sys_reg_desc *table, *r;
+	const struct sys_reg_desc *r;
 
-	table = get_target_table(vcpu->arch.target, false, &num);
+	if (!table)
+		return -1;	/* Not handled */
 
-	/* Search target-specific then generic table. */
 	r = find_reg(params, table, num);
-	if (!r)
-		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
 
-	if (likely(r)) {
+	if (r) {
 		/*
 		 * Not having an accessor means that we have
 		 * configured a trap that we don't know how to
@@ -612,12 +609,37 @@ static void emulate_cp15(struct kvm_vcpu *vcpu,
 		if (likely(r->access(vcpu, params, r))) {
 			/* Skip instruction, since it was emulated */
 			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-			return;
 		}
-		/* If access function fails, it should complain. */
+
+		/* Handled */
+		return 0;
 	}
 
-	kvm_err("Unsupported guest CP15 access at: %08lx\n", *vcpu_pc(vcpu));
+	/* Not handled */
+	return -1;
+}
+
+static void unhandled_cp_access(struct kvm_vcpu *vcpu,
+				struct sys_reg_params *params)
+{
+	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
+	int cp;
+
+	switch(hsr_ec) {
+	case ESR_EL2_EC_CP15_32:
+	case ESR_EL2_EC_CP15_64:
+		cp = 15;
+		break;
+	case ESR_EL2_EC_CP14_MR:
+	case ESR_EL2_EC_CP14_64:
+		cp = 14;
+		break;
+	default:
+		WARN_ON((cp = -1));
+	}
+
+	kvm_err("Unsupported guest CP%d access at: %08lx\n",
+		cp, *vcpu_pc(vcpu));
 	print_sys_reg_instr(params);
 	kvm_inject_undefined(vcpu);
 }
@@ -627,7 +649,11 @@ static void emulate_cp15(struct kvm_vcpu *vcpu,
  * @vcpu: The VCPU pointer
  * @run:  The kvm_run struct
  */
-int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+static int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *global,
+			    size_t nr_global,
+			    const struct sys_reg_desc *target_specific,
+			    size_t nr_specific)
 {
 	struct sys_reg_params params;
 	u32 hsr = kvm_vcpu_get_hsr(vcpu);
@@ -656,8 +682,14 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		*vcpu_reg(vcpu, params.Rt) = val;
 	}
 
-	emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		goto out;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		goto out;
+
+	unhandled_cp_access(vcpu, &params);
 
+out:
 	/* Do the opposite hack for the read side */
 	if (!params.is_write) {
 		u64 val = *vcpu_reg(vcpu, params.Rt);
@@ -673,7 +705,11 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
  * @vcpu: The VCPU pointer
  * @run:  The kvm_run struct
  */
-int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+static int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *global,
+			    size_t nr_global,
+			    const struct sys_reg_desc *target_specific,
+			    size_t nr_specific)
 {
 	struct sys_reg_params params;
 	u32 hsr = kvm_vcpu_get_hsr(vcpu);
@@ -688,10 +724,51 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Op1 = (hsr >> 14) & 0x7;
 	params.Op2 = (hsr >> 17) & 0x7;
 
-	emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		return 1;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		return 1;
+
+	unhandled_cp_access(vcpu, &params);
 	return 1;
 }
 
+int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	const struct sys_reg_desc *target_specific;
+	size_t num;
+
+	target_specific = get_target_table(vcpu->arch.target, false, &num);
+	return kvm_handle_cp_64(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
+
+int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	const struct sys_reg_desc *target_specific;
+	size_t num;
+
+	target_specific = get_target_table(vcpu->arch.target, false, &num);
+	return kvm_handle_cp_32(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
+
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	return kvm_handle_cp_64(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
+}
+
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	return kvm_handle_cp_32(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
+}
+
 static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 			   const struct sys_reg_params *params)
 {
-- 
1.8.3.4

* [PATCH 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:20   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:20 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

An interesting "feature" of the CP14 encoding is that there is
an overlap between 32 and 64bit registers, meaning they cannot
live in the same table as we did for CP15.

Create separate tables for 64bit CP14 and CP15 registers, and
let the top level handler use the right one.
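
To make the overlap concrete (a sketch built from the table entries
added later in this series; the zero-defaulting of unspecified fields
is an assumption about how 64bit entries are matched): a 64bit access
is identified by (Op1, CRm) only, so with CRn and Op2 left at zero,
64bit DBGDRAR and 32bit DBGDSCRint would collapse onto the same key
in a unified table:

	/* 32bit DBGDSCRint: Op1=0, CRn=0, CRm=1, Op2=0 */
	{ Op1( 0), CRn( 0), CRm( 1), Op2( 0), trap_wi_raz },
	/* 64bit DBGDRAR: Op1=0, CRm=1 -- no CRn/Op2, same key */
	{ Op1( 0), CRm( 1), .access = trap_wi_raz },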

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index feafd8d..91ca0e4 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -478,13 +478,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 static const struct sys_reg_desc cp14_regs[] = {
 };
 
+/* Trapped cp14 64bit registers */
+static const struct sys_reg_desc cp14_64_regs[] = {
+};
+
 /*
  * Trapped cp15 registers. TTBR0/TTBR1 get a double encoding,
  * depending on the way they are accessed (as a 32bit or a 64bit
  * register).
  */
 static const struct sys_reg_desc cp15_regs[] = {
-	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_sctlr, NULL, c1_SCTLR },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
@@ -525,6 +528,10 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 },
 	{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
 
+};
+
+static const struct sys_reg_desc cp15_64_regs[] = {
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 },
 };
 
@@ -740,7 +747,7 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 	target_specific = get_target_table(vcpu->arch.target, false, &num);
 	return kvm_handle_cp_64(vcpu,
-				cp15_regs, ARRAY_SIZE(cp15_regs),
+				cp15_64_regs, ARRAY_SIZE(cp15_64_regs),
 				target_specific, num);
 }
 
@@ -758,7 +765,7 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	return kvm_handle_cp_64(vcpu,
-				cp14_regs, ARRAY_SIZE(cp14_regs),
+				cp14_64_regs, ARRAY_SIZE(cp14_64_regs),
 				NULL, 0);
 }
 
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 6/9] arm64: KVM: check ordering of all system register tables
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:20   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:20 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

We now have multiple tables for the various system registers
we trap. Make sure we check the order of all of them, as it is
critical that we get the order right (been there, done that...).
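
The contract being checked is a strict lexicographic ordering of the
encoding fields. A minimal sketch of what cmp_sys_reg() is assumed to
compute (the comparison order is an assumption; the field names are
the ones used in the tables):

	static int cmp_sys_reg_sketch(const struct sys_reg_desc *i1,
				      const struct sys_reg_desc *i2)
	{
		/* strictly increasing (Op0, Op1, CRn, CRm, Op2) keeps
		 * each table duplicate-free and cheap to search */
		if (i1->Op0 != i2->Op0) return i1->Op0 - i2->Op0;
		if (i1->Op1 != i2->Op1) return i1->Op1 - i2->Op1;
		if (i1->CRn != i2->CRn) return i1->CRn - i2->CRn;
		if (i1->CRm != i2->CRm) return i1->CRm - i2->CRm;
		return i1->Op2 - i2->Op2;
	}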

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 91ca0e4..c27a5cd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1280,14 +1280,32 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	return write_demux_regids(uindices);
 }
 
+static int check_sysreg_table(const struct sys_reg_desc *table, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 1; i < n; i++) {
+		if (cmp_sys_reg(&table[i-1], &table[i]) >= 0) {
+			kvm_err("sys_reg table %p out of order (%d)\n", table, i - 1);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 void kvm_sys_reg_table_init(void)
 {
 	unsigned int i;
 	struct sys_reg_desc clidr;
 
 	/* Make sure tables are unique and in order. */
-	for (i = 1; i < ARRAY_SIZE(sys_reg_descs); i++)
-		BUG_ON(cmp_sys_reg(&sys_reg_descs[i-1], &sys_reg_descs[i]) >= 0);
+	BUG_ON(check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs)));
+	BUG_ON(check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs)));
+	BUG_ON(check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs)));
+	BUG_ON(check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs)));
+	BUG_ON(check_sysreg_table(cp15_64_regs, ARRAY_SIZE(cp15_64_regs)));
+	BUG_ON(check_sysreg_table(invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs)));
 
 	/* We abuse the reset function to overwrite the table itself. */
 	for (i = 0; i < ARRAY_SIZE(invariant_sys_regs); i++)
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 7/9] arm64: KVM: add trap handlers for AArch32 debug registers
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:20   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:20 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

Add handlers for all the AArch32 debug registers that are accessible
from EL0 or EL1. The code follows the same strategy as its AArch64
counterpart with regard to tracking the dirty state of the debug
registers.
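
The cp14_* indices rely on each 64bit sysreg slot providing two 32bit
copro slots (the shared-storage layout is the assumption here); the
defines below are from the kvm_asm.h hunk, the comments are the sketch:

	#define cp14_DBGBVR0	(DBGBVR0_EL1 * 2)	/* even half */
	#define cp14_DBGBXVR0	(cp14_DBGBVR0 + 1)	/* odd half */
	/* DBG_BCR_BVR_WCR_WVR(n) then strides by (n) * 2, so the pair
	 * DBGBVRn/DBGBXVRn aliases the single 64bit DBGBVRn_EL1 (which
	 * half is which depends on the union layout). */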

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h |   9 +++
 arch/arm64/kvm/sys_regs.c        | 137 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 12f9dd7..993a7db 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -93,6 +93,15 @@
 #define c10_AMAIR0	(AMAIR_EL1 * 2)	/* Aux Memory Attr Indirection Reg */
 #define c10_AMAIR1	(c10_AMAIR0 + 1)/* Aux Memory Attr Indirection Reg */
 #define c14_CNTKCTL	(CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
+
+#define cp14_DBGDSCRext	(MDSCR_EL1 * 2)
+#define cp14_DBGBCR0	(DBGBCR0_EL1 * 2)
+#define cp14_DBGBVR0	(DBGBVR0_EL1 * 2)
+#define cp14_DBGBXVR0	(cp14_DBGBVR0 + 1)
+#define cp14_DBGWCR0	(DBGWCR0_EL1 * 2)
+#define cp14_DBGWVR0	(DBGWVR0_EL1 * 2)
+#define cp14_DBGDCCINT	(MDCCINT_EL1 * 2)
+
 #define NR_COPRO_REGS	(NR_SYS_REGS * 2)
 
 #define ARM_EXCEPTION_IRQ	  0
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c27a5cd..4861fe4 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -474,12 +474,148 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	  NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
+static bool trap_dbgidr(struct kvm_vcpu *vcpu,
+			const struct sys_reg_params *p,
+			const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		return ignore_write(vcpu, p);
+	} else {
+		u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
+		u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
+		u32 el3 = !!((pfr >> 12) & 0xf);
+
+		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
+					  (((dfr >> 12) & 0xf) << 24) |
+					  (((dfr >> 28) & 0xf) << 20) |
+					  (6 << 16) | (el3 << 14) | (el3 << 12));
+		return true;
+	}
+}
+
+static bool trap_debug32(struct kvm_vcpu *vcpu,
+			 const struct sys_reg_params *p,
+			 const struct sys_reg_desc *r)
+{
+	if (p->is_write) {
+		vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+	} else {
+		*vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
+	}
+
+	return true;
+}
+
+#define DBG_BCR_BVR_WCR_WVR(n)					\
+	/* DBGBCRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 4), trap_debug32,	\
+	  NULL, (cp14_DBGBCR0 + (n) * 2) },			\
+	/* DBGBVRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 5), trap_debug32,	\
+	  NULL, (cp14_DBGBVR0 + (n) * 2) },			\
+	/* DBGWVRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 6), trap_debug32,	\
+	  NULL, (cp14_DBGWVR0 + (n) * 2) },			\
+	/* DBGWCRn */						\
+	{ Op1( 0), CRn( 0), CRm((n)), Op2( 7), trap_debug32,	\
+	  NULL, (cp14_DBGWCR0 + (n) * 2) }
+
+#define DBGBXVR(n)						\
+	{ Op1( 0), CRn( 1), CRm((n)), Op2( 1), trap_debug32,	\
+	  NULL, cp14_DBGBXVR0 + n * 2 }
+
 /* Trapped cp14 registers */
 static const struct sys_reg_desc cp14_regs[] = {
+	/* DBGIDR */
+	{ Op1( 0), CRn( 0), CRm( 0), Op2( 0), trap_dbgidr },
+	/* DBGDTRRXext */
+	{ Op1( 0), CRn( 0), CRm( 0), Op2( 2), trap_wi_raz },
+
+	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDSCRint */
+	{ Op1( 0), CRn( 0), CRm( 1), Op2( 0), trap_wi_raz },
+	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDCCINT */
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), trap_debug32 },
+	/* DBGDSCRext */
+	{ Op1( 0), CRn( 0), CRm( 2), Op2( 2), trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(2),
+	/* DBGDTRTXext */
+	{ Op1( 0), CRn( 0), CRm( 3), Op2( 2), trap_wi_raz },
+	DBG_BCR_BVR_WCR_WVR(3),
+	DBG_BCR_BVR_WCR_WVR(4),
+	DBG_BCR_BVR_WCR_WVR(5),
+	/* DBGWFAR */
+	{ Op1( 0), CRn( 0), CRm( 6), Op2( 0), trap_wi_raz },
+	/* DBGOSECCR */
+	{ Op1( 0), CRn( 0), CRm( 6), Op2( 2), trap_wi_raz },
+	DBG_BCR_BVR_WCR_WVR(6),
+	/* DBGVCR */
+	{ Op1( 0), CRn( 0), CRm( 7), Op2( 0), trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(7),
+	DBG_BCR_BVR_WCR_WVR(8),
+	DBG_BCR_BVR_WCR_WVR(9),
+	DBG_BCR_BVR_WCR_WVR(10),
+	DBG_BCR_BVR_WCR_WVR(11),
+	DBG_BCR_BVR_WCR_WVR(12),
+	DBG_BCR_BVR_WCR_WVR(13),
+	DBG_BCR_BVR_WCR_WVR(14),
+	DBG_BCR_BVR_WCR_WVR(15),
+
+	/* DBGDRAR (32bit) */
+	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), trap_wi_raz },
+
+	DBGBXVR(0),
+	/* DBGOSLAR */
+	{ Op1( 0), CRn( 1), CRm( 0), Op2( 4), trap_wi_raz },
+	DBGBXVR(1),
+	/* DBGOSLSR */
+	{ Op1( 0), CRn( 1), CRm( 1), Op2( 4), trap_oslsr_el1 },
+	DBGBXVR(2),
+	DBGBXVR(3),
+	/* DBGOSDLR */
+	{ Op1( 0), CRn( 1), CRm( 3), Op2( 4), trap_wi_raz },
+	DBGBXVR(4),
+	/* DBGPRCR */
+	{ Op1( 0), CRn( 1), CRm( 4), Op2( 4), trap_wi_raz },
+	DBGBXVR(5),
+	DBGBXVR(6),
+	DBGBXVR(7),
+	DBGBXVR(8),
+	DBGBXVR(9),
+	DBGBXVR(10),
+	DBGBXVR(11),
+	DBGBXVR(12),
+	DBGBXVR(13),
+	DBGBXVR(14),
+	DBGBXVR(15),
+
+	/* DBGDSAR (32bit) */
+	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), trap_wi_raz },
+
+	/* DBGDEVID2 */
+	{ Op1( 0), CRn( 7), CRm( 0), Op2( 7), trap_wi_raz },
+	/* DBGDEVID1 */
+	{ Op1( 0), CRn( 7), CRm( 1), Op2( 7), trap_wi_raz },
+	/* DBGDEVID */
+	{ Op1( 0), CRn( 7), CRm( 2), Op2( 7), trap_wi_raz },
+	/* DBGCLAIMSET */
+	{ Op1( 0), CRn( 7), CRm( 8), Op2( 6), trap_wi_raz },
+	/* DBGCLAIMCLR */
+	{ Op1( 0), CRn( 7), CRm( 9), Op2( 6), trap_wi_raz },
+
+	/* DBGAUTHSTATUS */
+	{ Op1( 0), CRn( 7), CRm(14), Op2( 6), trap_dbgauthstatus_el1 },
 };
 
 /* Trapped cp14 64bit registers */
 static const struct sys_reg_desc cp14_64_regs[] = {
+	/* DBGDRAR (64bit) */
+	{ Op1( 0), CRm( 1), .access = trap_wi_raz },
+
+	/* DBGDSAR (64bit) */
+	{ Op1( 0), CRm( 2), .access = trap_wi_raz },
 };
 
 /*
@@ -527,7 +663,6 @@ static const struct sys_reg_desc cp15_regs[] = {
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 0), access_vm_reg, NULL, c10_AMAIR0 },
 	{ Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 },
 	{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
-
 };
 
 static const struct sys_reg_desc cp15_64_regs[] = {
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:20   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:20 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

Implement switching of the debug registers. While the number
of registers is massive, CPUs usually don't implement them all
(A57 has 6 breakpoints and 4 watchpoints which, at two registers
each plus MDSCR_EL1 and MDCCINT_EL1, gives a total of "only" 22
registers).

Also, we only save/restore them when MDSCR_EL1 has debug enabled,
or when we've flagged the debug registers as dirty. It means that
most of the time, we only save/restore MDSCR_EL1.
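
As a C-level sketch of the entry-path logic below (flag and bit names
are from this series; vcpu_sys_reg() is assumed to be the context
accessor):

	static void compute_debug_state_sketch(struct kvm_vcpu *vcpu)
	{
		u64 mdscr = vcpu_sys_reg(vcpu, MDSCR_EL1);

		/* debug enabled by the guest, or already dirty? */
		if (mdscr & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
			vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
		/* dirty: full BCR/BVR/WCR/WVR + MDCCINT save/restore;
		 * clean: only MDSCR_EL1 travels with the sysregs */
	}

The assembly also skips the registers a CPU doesn't implement with a
computed branch: each mrs/str is a 4-byte instruction, so
"add x26, x26, x24, lsl #2; br x26" jumps over the first x24 entries
of the descending register sequence.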

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/asm-offsets.c |   1 +
 arch/arm64/kvm/hyp.S            | 449 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 444 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 646f888..ae73a83 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -120,6 +120,7 @@ int main(void)
   DEFINE(VCPU_ESR_EL2,		offsetof(struct kvm_vcpu, arch.fault.esr_el2));
   DEFINE(VCPU_FAR_EL2,		offsetof(struct kvm_vcpu, arch.fault.far_el2));
   DEFINE(VCPU_HPFAR_EL2,	offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
+  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
   DEFINE(VCPU_HCR_EL2,		offsetof(struct kvm_vcpu, arch.hcr_el2));
   DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
   DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 2c56012..f9d5a1d 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -21,6 +21,7 @@
 #include <asm/assembler.h>
 #include <asm/memory.h>
 #include <asm/asm-offsets.h>
+#include <asm/debug-monitors.h>
 #include <asm/fpsimdmacros.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
@@ -215,6 +216,7 @@ __kvm_hyp_code_start:
 	mrs	x22, 	amair_el1
 	mrs	x23, 	cntkctl_el1
 	mrs	x24,	par_el1
+	mrs	x25,	mdscr_el1
 
 	stp	x4, x5, [x3]
 	stp	x6, x7, [x3, #16]
@@ -226,7 +228,202 @@ __kvm_hyp_code_start:
 	stp	x18, x19, [x3, #112]
 	stp	x20, x21, [x3, #128]
 	stp	x22, x23, [x3, #144]
-	str	x24, [x3, #160]
+	stp	x24, x25, [x3, #160]
+.endm
+
+.macro save_debug
+	// x2: base address for cpu context
+	// x3: tmp register
+
+	mrs	x26, id_aa64dfr0_el1
+	ubfx	x24, x26, #12, #4	// Extract BRPs
+	ubfx	x25, x26, #20, #4	// Extract WRPs
+	mov	w26, #15
+	sub	w24, w26, w24		// How many BPs to skip
+	sub	w25, w26, w25		// How many WPs to skip
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgbcr15_el1
+	mrs	x19, dbgbcr14_el1
+	mrs	x18, dbgbcr13_el1
+	mrs	x17, dbgbcr12_el1
+	mrs	x16, dbgbcr11_el1
+	mrs	x15, dbgbcr10_el1
+	mrs	x14, dbgbcr9_el1
+	mrs	x13, dbgbcr8_el1
+	mrs	x12, dbgbcr7_el1
+	mrs	x11, dbgbcr6_el1
+	mrs	x10, dbgbcr5_el1
+	mrs	x9, dbgbcr4_el1
+	mrs	x8, dbgbcr3_el1
+	mrs	x7, dbgbcr2_el1
+	mrs	x6, dbgbcr1_el1
+	mrs	x5, dbgbcr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgbvr15_el1
+	mrs	x19, dbgbvr14_el1
+	mrs	x18, dbgbvr13_el1
+	mrs	x17, dbgbvr12_el1
+	mrs	x16, dbgbvr11_el1
+	mrs	x15, dbgbvr10_el1
+	mrs	x14, dbgbvr9_el1
+	mrs	x13, dbgbvr8_el1
+	mrs	x12, dbgbvr7_el1
+	mrs	x11, dbgbvr6_el1
+	mrs	x10, dbgbvr5_el1
+	mrs	x9, dbgbvr4_el1
+	mrs	x8, dbgbvr3_el1
+	mrs	x7, dbgbvr2_el1
+	mrs	x6, dbgbvr1_el1
+	mrs	x5, dbgbvr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgwcr15_el1
+	mrs	x19, dbgwcr14_el1
+	mrs	x18, dbgwcr13_el1
+	mrs	x17, dbgwcr12_el1
+	mrs	x16, dbgwcr11_el1
+	mrs	x15, dbgwcr10_el1
+	mrs	x14, dbgwcr9_el1
+	mrs	x13, dbgwcr8_el1
+	mrs	x12, dbgwcr7_el1
+	mrs	x11, dbgwcr6_el1
+	mrs	x10, dbgwcr5_el1
+	mrs	x9, dbgwcr4_el1
+	mrs	x8, dbgwcr3_el1
+	mrs	x7, dbgwcr2_el1
+	mrs	x6, dbgwcr1_el1
+	mrs	x5, dbgwcr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	mrs	x20, dbgwvr15_el1
+	mrs	x19, dbgwvr14_el1
+	mrs	x18, dbgwvr13_el1
+	mrs	x17, dbgwvr12_el1
+	mrs	x16, dbgwvr11_el1
+	mrs	x15, dbgwvr10_el1
+	mrs	x14, dbgwvr9_el1
+	mrs	x13, dbgwvr8_el1
+	mrs	x12, dbgwvr7_el1
+	mrs	x11, dbgwvr6_el1
+	mrs	x10, dbgwvr5_el1
+	mrs	x9, dbgwvr4_el1
+	mrs	x8, dbgwvr3_el1
+	mrs	x7, dbgwvr2_el1
+	mrs	x6, dbgwvr1_el1
+	mrs	x5, dbgwvr0_el1
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+
+1:
+	str	x20, [x3, #(15 * 8)]
+	str	x19, [x3, #(14 * 8)]
+	str	x18, [x3, #(13 * 8)]
+	str	x17, [x3, #(12 * 8)]
+	str	x16, [x3, #(11 * 8)]
+	str	x15, [x3, #(10 * 8)]
+	str	x14, [x3, #(9 * 8)]
+	str	x13, [x3, #(8 * 8)]
+	str	x12, [x3, #(7 * 8)]
+	str	x11, [x3, #(6 * 8)]
+	str	x10, [x3, #(5 * 8)]
+	str	x9, [x3, #(4 * 8)]
+	str	x8, [x3, #(3 * 8)]
+	str	x7, [x3, #(2 * 8)]
+	str	x6, [x3, #(1 * 8)]
+	str	x5, [x3, #(0 * 8)]
+
+	mrs	x21, mdccint_el1
+	str	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
 .endm
 
 .macro restore_sysregs
@@ -245,7 +442,7 @@ __kvm_hyp_code_start:
 	ldp	x18, x19, [x3, #112]
 	ldp	x20, x21, [x3, #128]
 	ldp	x22, x23, [x3, #144]
-	ldr	x24, [x3, #160]
+	ldp	x24, x25, [x3, #160]
 
 	msr	vmpidr_el2,	x4
 	msr	csselr_el1,	x5
@@ -268,6 +465,198 @@ __kvm_hyp_code_start:
 	msr	amair_el1,	x22
 	msr	cntkctl_el1,	x23
 	msr	par_el1,	x24
+	msr	mdscr_el1,	x25
+.endm
+
+.macro restore_debug
+	// x2: base address for cpu context
+	// x3: tmp register
+
+	mrs	x26, id_aa64dfr0_el1
+	ubfx	x24, x26, #12, #4	// Extract BRPs
+	ubfx	x25, x26, #20, #4	// Extract WRPs
+	mov	w26, #15
+	sub	w24, w26, w24		// How many BPs to skip
+	sub	w25, w26, w25		// How many WPs to skip
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	msr	dbgbcr15_el1, x20
+	msr	dbgbcr14_el1, x19
+	msr	dbgbcr13_el1, x18
+	msr	dbgbcr12_el1, x17
+	msr	dbgbcr11_el1, x16
+	msr	dbgbcr10_el1, x15
+	msr	dbgbcr9_el1, x14
+	msr	dbgbcr8_el1, x13
+	msr	dbgbcr7_el1, x12
+	msr	dbgbcr6_el1, x11
+	msr	dbgbcr5_el1, x10
+	msr	dbgbcr4_el1, x9
+	msr	dbgbcr3_el1, x8
+	msr	dbgbcr2_el1, x7
+	msr	dbgbcr1_el1, x6
+	msr	dbgbcr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x24, lsl #2
+	br	x26
+1:
+	msr	dbgbvr15_el1, x20
+	msr	dbgbvr14_el1, x19
+	msr	dbgbvr13_el1, x18
+	msr	dbgbvr12_el1, x17
+	msr	dbgbvr11_el1, x16
+	msr	dbgbvr10_el1, x15
+	msr	dbgbvr9_el1, x14
+	msr	dbgbvr8_el1, x13
+	msr	dbgbvr7_el1, x12
+	msr	dbgbvr6_el1, x11
+	msr	dbgbvr5_el1, x10
+	msr	dbgbvr4_el1, x9
+	msr	dbgbvr3_el1, x8
+	msr	dbgbvr2_el1, x7
+	msr	dbgbvr1_el1, x6
+	msr	dbgbvr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	msr	dbgwcr15_el1, x20
+	msr	dbgwcr14_el1, x19
+	msr	dbgwcr13_el1, x18
+	msr	dbgwcr12_el1, x17
+	msr	dbgwcr11_el1, x16
+	msr	dbgwcr10_el1, x15
+	msr	dbgwcr9_el1, x14
+	msr	dbgwcr8_el1, x13
+	msr	dbgwcr7_el1, x12
+	msr	dbgwcr6_el1, x11
+	msr	dbgwcr5_el1, x10
+	msr	dbgwcr4_el1, x9
+	msr	dbgwcr3_el1, x8
+	msr	dbgwcr2_el1, x7
+	msr	dbgwcr1_el1, x6
+	msr	dbgwcr0_el1, x5
+
+	add	x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	ldr	x20, [x3, #(15 * 8)]
+	ldr	x19, [x3, #(14 * 8)]
+	ldr	x18, [x3, #(13 * 8)]
+	ldr	x17, [x3, #(12 * 8)]
+	ldr	x16, [x3, #(11 * 8)]
+	ldr	x15, [x3, #(10 * 8)]
+	ldr	x14, [x3, #(9 * 8)]
+	ldr	x13, [x3, #(8 * 8)]
+	ldr	x12, [x3, #(7 * 8)]
+	ldr	x11, [x3, #(6 * 8)]
+	ldr	x10, [x3, #(5 * 8)]
+	ldr	x9, [x3, #(4 * 8)]
+	ldr	x8, [x3, #(3 * 8)]
+	ldr	x7, [x3, #(2 * 8)]
+	ldr	x6, [x3, #(1 * 8)]
+	ldr	x5, [x3, #(0 * 8)]
+
+	adr	x26, 1f
+	add	x26, x26, x25, lsl #2
+	br	x26
+1:
+	msr	dbgwvr15_el1, x20
+	msr	dbgwvr14_el1, x19
+	msr	dbgwvr13_el1, x18
+	msr	dbgwvr12_el1, x17
+	msr	dbgwvr11_el1, x16
+	msr	dbgwvr10_el1, x15
+	msr	dbgwvr9_el1, x14
+	msr	dbgwvr8_el1, x13
+	msr	dbgwvr7_el1, x12
+	msr	dbgwvr6_el1, x11
+	msr	dbgwvr5_el1, x10
+	msr	dbgwvr4_el1, x9
+	msr	dbgwvr3_el1, x8
+	msr	dbgwvr2_el1, x7
+	msr	dbgwvr1_el1, x6
+	msr	dbgwvr0_el1, x5
+
+	ldr	x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
+	msr	mdccint_el1, x21
 .endm
 
 .macro skip_32bit_state tmp, target
@@ -282,6 +671,11 @@ __kvm_hyp_code_start:
 	tbz	\tmp, #12, \target
 .endm
 
+.macro skip_clean_debug_state tmp, target
+	ldr	\tmp, [x0, #VCPU_DEBUG_FLAGS]
+	tbz	\tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
+.endm
+
 .macro save_guest_32bit_state
 	skip_32bit_state x3, 1f
 
@@ -297,10 +691,13 @@ __kvm_hyp_code_start:
 	mrs	x4, dacr32_el2
 	mrs	x5, ifsr32_el2
 	mrs	x6, fpexc32_el2
-	mrs	x7, dbgvcr32_el2
 	stp	x4, x5, [x3]
-	stp	x6, x7, [x3, #16]
+	str	x6, [x3, #16]
 
+	skip_clean_debug_state x8, 2f
+	mrs	x7, dbgvcr32_el2
+	str	x7, [x3, #24]
+2:
 	skip_tee_state x8, 1f
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
@@ -323,12 +720,15 @@ __kvm_hyp_code_start:
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
 	ldp	x4, x5, [x3]
-	ldp	x6, x7, [x3, #16]
+	ldr	x6, [x3, #16]
 	msr	dacr32_el2, x4
 	msr	ifsr32_el2, x5
 	msr	fpexc32_el2, x6
-	msr	dbgvcr32_el2, x7
 
+	skip_clean_debug_state x8, 2f
+	ldr	x7, [x3, #24]
+	msr	dbgvcr32_el2, x7
+2:
 	skip_tee_state x8, 1f
 
 	add	x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
@@ -537,6 +937,14 @@ __restore_sysregs:
 	restore_sysregs
 	ret
 
+__save_debug:
+	save_debug
+	ret
+
+__restore_debug:
+	restore_debug
+	ret
+
 __save_fpsimd:
 	save_fpsimd
 	ret
@@ -568,6 +976,21 @@ ENTRY(__kvm_vcpu_run)
 	bl __save_fpsimd
 	bl __save_sysregs
 
+	// Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
+	// is set, we do a full save/restore cycle and disable trapping.
+	add	x25, x0, #VCPU_CONTEXT
+	ldr	x25, [x25, #CPU_SYSREG_OFFSET(MDSCR_EL1)]
+	and	x26, x25, #DBG_MDSCR_KDE
+	and	x25, x25, #DBG_MDSCR_MDE
+	adds	xzr, x25, x26
+	mov	x26, #KVM_ARM64_DEBUG_DIRTY
+	csel	x25, x26, xzr, ne
+	ldr	x26, [x0, #VCPU_DEBUG_FLAGS]
+	orr	x26, x26, x25
+	cbz	x26, 1f
+	str	x26, [x0, #VCPU_DEBUG_FLAGS]
+	bl	__save_debug
+1:
 	activate_traps
 	activate_vm
 
@@ -579,6 +1002,10 @@ ENTRY(__kvm_vcpu_run)
 
 	bl __restore_sysregs
 	bl __restore_fpsimd
+
+	skip_clean_debug_state x3, 1f
+	bl	__restore_debug
+1:
 	restore_guest_32bit_state
 	restore_guest_regs
 
@@ -595,6 +1022,10 @@ __kvm_vcpu_return:
 	save_guest_regs
 	bl __save_fpsimd
 	bl __save_sysregs
+
+	skip_clean_debug_state x3, 1f
+	bl	__save_debug
+1:
 	save_guest_32bit_state
 
 	save_timer_state
@@ -609,6 +1040,12 @@ __kvm_vcpu_return:
 
 	bl __restore_sysregs
 	bl __restore_fpsimd
+
+	skip_clean_debug_state x3, 1f
+	// Clear the dirty flag for the next run
+	str	xzr, [x0, #VCPU_DEBUG_FLAGS]
+	bl	__restore_debug
+1:
 	restore_host_regs
 
 	mov	x0, x1
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 9/9] arm64: KVM: enable trapping of all debug registers
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:20   ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:20 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm
  Cc: Christoffer Dall, Will Deacon, Catalin Marinas, Ian Campbell

Enable trapping of the debug registers, preventing guests from
messing with the host state (while still allowing them to use the
debug infrastructure as well).
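
As a C-level sketch of the hunk below (the MDCR_EL2_* names are from
the patch; read_mdcr()/write_mdcr() merely stand in for the mrs/msr
pair):

	u32 mdcr = read_mdcr() & MDCR_EL2_HPMN_MASK;

	mdcr |= MDCR_EL2_TPM | MDCR_EL2_TPMCR;		/* PMU traps */
	mdcr |= MDCR_EL2_TDRA | MDCR_EL2_TDOSA;		/* ROM/OS traps */
	if (!(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY))
		mdcr |= MDCR_EL2_TDA;	/* trap general debug registers */
	write_mdcr(mdcr);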

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp.S | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index f9d5a1d..037d5d9 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -753,6 +753,14 @@ __kvm_hyp_code_start:
 	mrs	x2, mdcr_el2
 	and	x2, x2, #MDCR_EL2_HPMN_MASK
 	orr	x2, x2, #(MDCR_EL2_TPM | MDCR_EL2_TPMCR)
+	orr	x2, x2, #(MDCR_EL2_TDRA | MDCR_EL2_TDOSA)
+
+	// Check for KVM_ARM64_DEBUG_DIRTY, and set debug to trap
+	// if not dirty.
+	ldr	x3, [x0, #VCPU_DEBUG_FLAGS]
+	tbnz	x3, #KVM_ARM64_DEBUG_DIRTY_SHIFT, 1f
+	orr	x2, x2,  #MDCR_EL2_TDA
+1:
 	msr	mdcr_el2, x2
 .endm
 
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH 1/9] arm64: KVM: rename pm_fake handler to trap_wi_raz
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-07 15:34     ` Peter Maydell
  -1 siblings, 0 replies; 60+ messages in thread
From: Peter Maydell @ 2014-05-07 15:34 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, arm-mail-list, kvm-devel, Catalin Marinas, Will Deacon,
	Ian Campbell

On 7 May 2014 16:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
> pm_fake doesn't quite describe what the handler does (ignoring writes
> and returning 0 for reads).
>
> As we're about to use it (a lot) in a different context, rename it
> with an (admittedly cryptic) name that makes sense for all users.

> -/*
> - * We could trap ID_DFR0 and tell the guest we don't support performance
> - * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
> - * NAKed, so it will read the PMCR anyway.
> - *
> - * Therefore we tell the guest we have 0 counters.  Unfortunately, we
> - * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
> - * all PM registers, which doesn't crash the guest kernel at least.
> - */
> -static bool pm_fake(struct kvm_vcpu *vcpu,
> -                   const struct sys_reg_params *p,
> -                   const struct sys_reg_desc *r)
> +static bool trap_wi_raz(struct kvm_vcpu *vcpu,
> +                       const struct sys_reg_params *p,
> +                       const struct sys_reg_desc *r)

The standard term for this is "RAZ/WI", not "WI/RAZ", so
why not "trap_raz_wi" ?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 1/9] arm64: KVM: rename pm_fake handler to trap_wi_raz
  2014-05-07 15:34     ` Peter Maydell
@ 2014-05-07 15:42       ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:42 UTC (permalink / raw)
  To: Peter Maydell
  Cc: kvmarm, arm-mail-list, kvm-devel, Catalin Marinas, Will Deacon,
	Ian Campbell

On 07/05/14 16:34, Peter Maydell wrote:
> On 7 May 2014 16:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> pm_fake doesn't quite describe what the handler does (ignoring writes
>> and returning 0 for reads).
>>
>> As we're about to use it (a lot) in a different context, rename it
>> with an (admittedly cryptic) name that makes sense for all users.
> 
>> -/*
>> - * We could trap ID_DFR0 and tell the guest we don't support performance
>> - * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
>> - * NAKed, so it will read the PMCR anyway.
>> - *
>> - * Therefore we tell the guest we have 0 counters.  Unfortunately, we
>> - * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
>> - * all PM registers, which doesn't crash the guest kernel at least.
>> - */
>> -static bool pm_fake(struct kvm_vcpu *vcpu,
>> -                   const struct sys_reg_params *p,
>> -                   const struct sys_reg_desc *r)
>> +static bool trap_wi_raz(struct kvm_vcpu *vcpu,
>> +                       const struct sys_reg_params *p,
>> +                       const struct sys_reg_desc *r)
> 
> The standard term for this is "RAZ/WI", not "WI/RAZ", so
> why not "trap_raz_wi" ?

Good point. I'll update it.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-07 15:42   ` Peter Maydell
  -1 siblings, 0 replies; 60+ messages in thread
From: Peter Maydell @ 2014-05-07 15:42 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, arm-mail-list, kvm-devel, Catalin Marinas, Will Deacon,
	Ian Campbell

On 7 May 2014 16:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
> This patch series adds debug support, a key feature missing from the
> KVM/arm64 port.
>
> The main idea is to keep track of whether the debug registers are
> "dirty" (changed by the guest) or not. In this case, perform the usual
> save/restore dance, for one run only. It means we only have a penalty
> if a guest is actually using the debug registers.
>
> The huge amount of registers is properly frightening, but CPUs
> actually only implement a subset of them. Also, there is a number of
> registers we don't bother emulating (things having to do with external
> debug and OSlock).

Presumably these registers now appear in the userspace
interface too, yes? Did you check that they all cope with
the "migration reads all register values on the source and then
writes them on the destination in arbitrary order" semantics without
further fiddling? (I have a note that says that at least
OSLAR_EL1/OSLSR_EL1 won't work that way, for instance.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-07 15:42   ` Peter Maydell
@ 2014-05-07 15:57     ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-07 15:57 UTC (permalink / raw)
  To: Peter Maydell
  Cc: kvmarm, arm-mail-list, kvm-devel, Catalin Marinas, Will Deacon,
	Ian Campbell

On 07/05/14 16:42, Peter Maydell wrote:
> On 7 May 2014 16:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> This patch series adds debug support, a key feature missing from the
>> KVM/arm64 port.
>>
>> The main idea is to keep track of whether the debug registers are
>> "dirty" (changed by the guest) or not. In this case, perform the usual
>> save/restore dance, for one run only. It means we only have a penalty
>> if a guest is actually using the debug registers.
>>
>> The huge amount of registers is properly frightening, but CPUs
>> actually only implement a subset of them. Also, there is a number of
>> registers we don't bother emulating (things having to do with external
>> debug and OSlock).
> 
> Presumably these registers now appear in the userspace
> interface too, yes? Did you check that they all cope with
> the "migration reads all register values on the source and then
> writes them on the destination in arbitrary order" semantics without
> further fiddling? (I have a note that says that at least
> OSLAR_EL1/OSLSR_EL1 won't work that way, for instance.)

The only registers that are exported to userspace are MDSCR_EL1,
DBG{BW}{CV}Rn_EL1 and MDCCINT_EL1 (and their 32bit counterparts). They
should be fine being saved/restored in any order, as long as you're not
running the vcpu in between.

The OSL*_EL1 registers are simply ignored for now. It is still
unclear how they would be supported in a guest, and I don't have any
test environment for them. Should the need arise, support can be
added on top of what we have now.
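
As an illustration of the ordering contract above: a migration loop
built on the KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctls could look like
the sketch below. The reg_ids[] list is hypothetical, but the ioctls
and struct kvm_one_reg are the standard one-reg interface; the point
is that the second loop may visit the registers in any order.

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	static int migrate_regs(int src_vcpu_fd, int dst_vcpu_fd,
				const __u64 *reg_ids, __u64 *vals,
				int nr_regs)
	{
		struct kvm_one_reg reg;
		int i;

		/* Source side: read every register value first... */
		for (i = 0; i < nr_regs; i++) {
			reg.id = reg_ids[i];
			reg.addr = (__u64)(unsigned long)&vals[i];
			if (ioctl(src_vcpu_fd, KVM_GET_ONE_REG, &reg))
				return -1;
		}

		/* ...destination side: write them all back. Neither vcpu
		 * runs in between, so ordering must not matter. */
		for (i = 0; i < nr_regs; i++) {
			reg.id = reg_ids[i];
			reg.addr = (__u64)(unsigned long)&vals[i];
			if (ioctl(dst_vcpu_fd, KVM_SET_ONE_REG, &reg))
				return -1;
		}
		return 0;
	}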

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-07 17:14     ` Will Deacon
  -1 siblings, 0 replies; 60+ messages in thread
From: Will Deacon @ 2014-05-07 17:14 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Christoffer Dall, Catalin Marinas,
	Ian Campbell

On Wed, May 07, 2014 at 04:20:47PM +0100, Marc Zyngier wrote:
> In order to be able to use the DBG_MDSCR_* macros from the KVM code,
> move the relevant definitions to the obvious include file.
> 
> Also move the debug_el enum to a portion of the file that is guarded
> by #ifndef __ASSEMBLY__ in order to use that file from assembly code.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/debug-monitors.h | 19 ++++++++++++++-----
>  arch/arm64/kernel/debug-monitors.c      |  9 ---------
>  2 files changed, 14 insertions(+), 14 deletions(-)

  Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/9] arm64: KVM: add trap handlers for AArch64 debug registers
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:27     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:27 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On 7 May 2014 20:50, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Add handlers for all the AArch64 debug registers that are accessible
> from EL0 or EL1. The trapping code keeps track of the state of the
> debug registers, allowing for the switch code to implement a lazy
> switching strategy.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h  |  28 ++++++--
>  arch/arm64/include/asm/kvm_host.h |   3 +
>  arch/arm64/kvm/sys_regs.c         | 130 +++++++++++++++++++++++++++++++++++++-
>  3 files changed, 151 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 9fcd54b..e6b159a 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -43,14 +43,25 @@
>  #define        AMAIR_EL1       19      /* Aux Memory Attribute Indirection Register */
>  #define        CNTKCTL_EL1     20      /* Timer Control Register (EL1) */
>  #define        PAR_EL1         21      /* Physical Address Register */
> +#define MDSCR_EL1      22      /* Monitor Debug System Control Register */
> +#define DBGBCR0_EL1    23      /* Debug Breakpoint Control Registers (0-15) */
> +#define DBGBCR15_EL1   38
> +#define DBGBVR0_EL1    39      /* Debug Breakpoint Value Registers (0-15) */
> +#define DBGBVR15_EL1   54
> +#define DBGWCR0_EL1    55      /* Debug Watchpoint Control Registers (0-15) */
> +#define DBGWCR15_EL1   70
> +#define DBGWVR0_EL1    71      /* Debug Watchpoint Value Registers (0-15) */
> +#define DBGWVR15_EL1   86
> +#define MDCCINT_EL1    87      /* Monitor Debug Comms Channel Interrupt Enable Reg */
> +
>  /* 32bit specific registers. Keep them at the end of the range */
> -#define        DACR32_EL2      22      /* Domain Access Control Register */
> -#define        IFSR32_EL2      23      /* Instruction Fault Status Register */
> -#define        FPEXC32_EL2     24      /* Floating-Point Exception Control Register */
> -#define        DBGVCR32_EL2    25      /* Debug Vector Catch Register */
> -#define        TEECR32_EL1     26      /* ThumbEE Configuration Register */
> -#define        TEEHBR32_EL1    27      /* ThumbEE Handler Base Register */
> -#define        NR_SYS_REGS     28
> +#define        DACR32_EL2      88      /* Domain Access Control Register */
> +#define        IFSR32_EL2      89      /* Instruction Fault Status Register */
> +#define        FPEXC32_EL2     90      /* Floating-Point Exception Control Register */
> +#define        DBGVCR32_EL2    91      /* Debug Vector Catch Register */
> +#define        TEECR32_EL1     92      /* ThumbEE Configuration Register */
> +#define        TEEHBR32_EL1    93      /* ThumbEE Handler Base Register */
> +#define        NR_SYS_REGS     94
>
>  /* 32bit mapping */
>  #define c0_MPIDR       (MPIDR_EL1 * 2) /* MultiProcessor ID Register */
> @@ -87,6 +98,9 @@
>  #define ARM_EXCEPTION_IRQ        0
>  #define ARM_EXCEPTION_TRAP       1
>
> +#define KVM_ARM64_DEBUG_DIRTY_SHIFT    0
> +#define KVM_ARM64_DEBUG_DIRTY          (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
> +
>  #ifndef __ASSEMBLY__
>  struct kvm;
>  struct kvm_vcpu;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 0a1d697..4737961 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -101,6 +101,9 @@ struct kvm_vcpu_arch {
>         /* Exception Information */
>         struct kvm_vcpu_fault_info fault;
>
> +       /* Debug state */
> +       u64 debug_flags;
> +
>         /* Pointer to host CPU context */
>         kvm_cpu_context_t *host_cpu_context;
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index fc8d4e3..618d4fb 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -30,6 +30,7 @@
>  #include <asm/kvm_mmu.h>
>  #include <asm/cacheflush.h>
>  #include <asm/cputype.h>
> +#include <asm/debug-monitors.h>
>  #include <trace/events/kvm.h>
>
>  #include "sys_regs.h"
> @@ -173,6 +174,58 @@ static bool trap_wi_raz(struct kvm_vcpu *vcpu,
>                 return read_zero(vcpu, p);
>  }
>
> +static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
> +                          const struct sys_reg_params *p,
> +                          const struct sys_reg_desc *r)
> +{
> +       if (p->is_write) {
> +               return ignore_write(vcpu, p);
> +       } else {
> +               *vcpu_reg(vcpu, p->Rt) = (1 << 3);
> +               return true;
> +       }
> +}
> +
> +static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
> +                                  const struct sys_reg_params *p,
> +                                  const struct sys_reg_desc *r)
> +{
> +       if (p->is_write) {
> +               return ignore_write(vcpu, p);
> +       } else {
> +               *vcpu_reg(vcpu, p->Rt) = 0x2222; /* Implemented and disabled */
> +               return true;
> +       }
> +}
> +
> +/*
> + * Trap handler for DBG[BW][CV]Rn_EL1 and MDSCR_EL1. We track the
> + * "dirtiness" of the registers.
> + */
> +static bool trap_debug_regs(struct kvm_vcpu *vcpu,
> +                           const struct sys_reg_params *p,
> +                           const struct sys_reg_desc *r)
> +{
> +       /*
> +        * The best thing to do would be to trap MDSCR_EL1
> +        * independently, test if DBG_MDSCR_KDE or DBG_MDSCR_MDE is
> +        * getting set, and only set the DIRTY bit in that case.
> +        *
> +        * Unfortunately, "old" Linux kernels tend to hit MDSCR_EL1
> +        * like a woodpecker on a tree, and it is better to disable
> +        * trapping as soon as possible in this case. Some day, make
> +        * this a tuneable...
> +        */
> +       if (p->is_write) {
> +               vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> +               vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> +       } else {
> +               *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
> +       }
> +
> +       return true;
> +}
> +
>  static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>  {
>         u64 amair;
> @@ -189,6 +242,21 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>         vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
>  }
>
> +/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go*/
> +#define DBG_BCR_BVR_WCR_WVR_EL1(n)                                     \
> +       /* DBGBVRn_EL1 */                                               \
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100),     \
> +         trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 },         \
> +       /* DBGBCRn_EL1 */                                               \
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101),     \
> +         trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 },         \
> +       /* DBGWVRn_EL1 */                                               \
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110),     \
> +         trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 },         \
> +       /* DBGWCRn_EL1 */                                               \
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111),     \
> +         trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 }
> +
>  /*
>   * Architected system registers.
>   * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> @@ -200,9 +268,6 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>   * Therefore we tell the guest we have 0 counters.  Unfortunately, we
>   * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
>   * all PM registers, which doesn't crash the guest kernel at least.
> - *
> - * Same goes for the whole debug infrastructure, which probably breaks
> - * some guest functionnality. This should be fixed.
>   */
>  static const struct sys_reg_desc sys_reg_descs[] = {
>         /* DC ISW */
> @@ -215,12 +280,71 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>         { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
>           access_dcsw },
>
> +       DBG_BCR_BVR_WCR_WVR_EL1(0),
> +       DBG_BCR_BVR_WCR_WVR_EL1(1),
> +       /* MDCCINT_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000),
> +         trap_debug_regs, reset_val, MDCCINT_EL1, 0 },
> +       /* MDSCR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010),
> +         trap_debug_regs, reset_val, MDSCR_EL1, 0 },
> +       DBG_BCR_BVR_WCR_WVR_EL1(2),
> +       DBG_BCR_BVR_WCR_WVR_EL1(3),
> +       DBG_BCR_BVR_WCR_WVR_EL1(4),
> +       DBG_BCR_BVR_WCR_WVR_EL1(5),
> +       DBG_BCR_BVR_WCR_WVR_EL1(6),
> +       DBG_BCR_BVR_WCR_WVR_EL1(7),
> +       DBG_BCR_BVR_WCR_WVR_EL1(8),
> +       DBG_BCR_BVR_WCR_WVR_EL1(9),
> +       DBG_BCR_BVR_WCR_WVR_EL1(10),
> +       DBG_BCR_BVR_WCR_WVR_EL1(11),
> +       DBG_BCR_BVR_WCR_WVR_EL1(12),
> +       DBG_BCR_BVR_WCR_WVR_EL1(13),
> +       DBG_BCR_BVR_WCR_WVR_EL1(14),
> +       DBG_BCR_BVR_WCR_WVR_EL1(15),
> +
> +       /* MDRAR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000),
> +         trap_wi_raz },
> +       /* OSLAR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100),
> +         trap_wi_raz },
> +       /* OSLSR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100),
> +         trap_oslsr_el1 },
> +       /* OSDLR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100),
> +         trap_wi_raz },
> +       /* DBGPRCR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100),
> +         trap_wi_raz },
> +       /* DBGCLAIMSET_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110),
> +         trap_wi_raz },
> +       /* DBGCLAIMCLR_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110),
> +         trap_wi_raz },
> +       /* DBGAUTHSTATUS_EL1 */
> +       { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
> +         trap_dbgauthstatus_el1 },
> +
>         /* TEECR32_EL1 */
>         { Op0(0b10), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000),
>           NULL, reset_val, TEECR32_EL1, 0 },
>         /* TEEHBR32_EL1 */
>         { Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b0000), Op2(0b000),
>           NULL, reset_val, TEEHBR32_EL1, 0 },
> +
> +       /* MDCCSR_EL1 */
> +       { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000),
> +         trap_wi_raz },
> +       /* DBGDTR_EL0 */
> +       { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000),
> +         trap_wi_raz },
> +       /* DBGDTR[TR]X_EL0 */
> +       { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000),
> +         trap_wi_raz },
> +
>         /* DBGVCR32_EL2 */
>         { Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000),
>           NULL, reset_val, DBGVCR32_EL2, 0 },
> --
> 1.8.3.4
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:29     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:29     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 6/9] arm64: KVM: check ordering of all system register tables
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:31     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:31 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 7/9] arm64: KVM: add trap handlers for AArch32 debug registers
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:33     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:38     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:38 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On 7 May 2014 20:50, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Implement switching of the debug registers. While the number
> of registers is massive, CPUs usually don't implement them all
> (A57 has 6 breakpoints and 4 watchpoints, which gives us a total
> of 22 registers "only").
>
> Also, we only save/restore them when MDSCR_EL1 has debug enabled,
> or when we've flagged the debug registers as dirty. It means that
> most of the time, we only save/restore MDSCR_EL1.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kernel/asm-offsets.c |   1 +
>  arch/arm64/kvm/hyp.S            | 449 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 444 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 646f888..ae73a83 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -120,6 +120,7 @@ int main(void)
>    DEFINE(VCPU_ESR_EL2,         offsetof(struct kvm_vcpu, arch.fault.esr_el2));
>    DEFINE(VCPU_FAR_EL2,         offsetof(struct kvm_vcpu, arch.fault.far_el2));
>    DEFINE(VCPU_HPFAR_EL2,       offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
> +  DEFINE(VCPU_DEBUG_FLAGS,     offsetof(struct kvm_vcpu, arch.debug_flags));
>    DEFINE(VCPU_HCR_EL2,         offsetof(struct kvm_vcpu, arch.hcr_el2));
>    DEFINE(VCPU_IRQ_LINES,       offsetof(struct kvm_vcpu, arch.irq_lines));
>    DEFINE(VCPU_HOST_CONTEXT,    offsetof(struct kvm_vcpu, arch.host_cpu_context));
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 2c56012..f9d5a1d 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -21,6 +21,7 @@
>  #include <asm/assembler.h>
>  #include <asm/memory.h>
>  #include <asm/asm-offsets.h>
> +#include <asm/debug-monitors.h>
>  #include <asm/fpsimdmacros.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
> @@ -215,6 +216,7 @@ __kvm_hyp_code_start:
>         mrs     x22,    amair_el1
>         mrs     x23,    cntkctl_el1
>         mrs     x24,    par_el1
> +       mrs     x25,    mdscr_el1
>
>         stp     x4, x5, [x3]
>         stp     x6, x7, [x3, #16]
> @@ -226,7 +228,202 @@ __kvm_hyp_code_start:
>         stp     x18, x19, [x3, #112]
>         stp     x20, x21, [x3, #128]
>         stp     x22, x23, [x3, #144]
> -       str     x24, [x3, #160]
> +       stp     x24, x25, [x3, #160]
> +.endm
> +
> +.macro save_debug
> +       // x2: base address for cpu context
> +       // x3: tmp register
> +
> +       mrs     x26, id_aa64dfr0_el1
> +       ubfx    x24, x26, #12, #4       // Extract BRPs
> +       ubfx    x25, x26, #20, #4       // Extract WRPs
> +       mov     w26, #15
> +       sub     w24, w26, w24           // How many BPs to skip
> +       sub     w25, w26, w25           // How many WPs to skip
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       mrs     x20, dbgbcr15_el1
> +       mrs     x19, dbgbcr14_el1
> +       mrs     x18, dbgbcr13_el1
> +       mrs     x17, dbgbcr12_el1
> +       mrs     x16, dbgbcr11_el1
> +       mrs     x15, dbgbcr10_el1
> +       mrs     x14, dbgbcr9_el1
> +       mrs     x13, dbgbcr8_el1
> +       mrs     x12, dbgbcr7_el1
> +       mrs     x11, dbgbcr6_el1
> +       mrs     x10, dbgbcr5_el1
> +       mrs     x9, dbgbcr4_el1
> +       mrs     x8, dbgbcr3_el1
> +       mrs     x7, dbgbcr2_el1
> +       mrs     x6, dbgbcr1_el1
> +       mrs     x5, dbgbcr0_el1
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +
> +1:
> +       str     x20, [x3, #(15 * 8)]
> +       str     x19, [x3, #(14 * 8)]
> +       str     x18, [x3, #(13 * 8)]
> +       str     x17, [x3, #(12 * 8)]
> +       str     x16, [x3, #(11 * 8)]
> +       str     x15, [x3, #(10 * 8)]
> +       str     x14, [x3, #(9 * 8)]
> +       str     x13, [x3, #(8 * 8)]
> +       str     x12, [x3, #(7 * 8)]
> +       str     x11, [x3, #(6 * 8)]
> +       str     x10, [x3, #(5 * 8)]
> +       str     x9, [x3, #(4 * 8)]
> +       str     x8, [x3, #(3 * 8)]
> +       str     x7, [x3, #(2 * 8)]
> +       str     x6, [x3, #(1 * 8)]
> +       str     x5, [x3, #(0 * 8)]
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       mrs     x20, dbgbvr15_el1
> +       mrs     x19, dbgbvr14_el1
> +       mrs     x18, dbgbvr13_el1
> +       mrs     x17, dbgbvr12_el1
> +       mrs     x16, dbgbvr11_el1
> +       mrs     x15, dbgbvr10_el1
> +       mrs     x14, dbgbvr9_el1
> +       mrs     x13, dbgbvr8_el1
> +       mrs     x12, dbgbvr7_el1
> +       mrs     x11, dbgbvr6_el1
> +       mrs     x10, dbgbvr5_el1
> +       mrs     x9, dbgbvr4_el1
> +       mrs     x8, dbgbvr3_el1
> +       mrs     x7, dbgbvr2_el1
> +       mrs     x6, dbgbvr1_el1
> +       mrs     x5, dbgbvr0_el1
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +
> +1:
> +       str     x20, [x3, #(15 * 8)]
> +       str     x19, [x3, #(14 * 8)]
> +       str     x18, [x3, #(13 * 8)]
> +       str     x17, [x3, #(12 * 8)]
> +       str     x16, [x3, #(11 * 8)]
> +       str     x15, [x3, #(10 * 8)]
> +       str     x14, [x3, #(9 * 8)]
> +       str     x13, [x3, #(8 * 8)]
> +       str     x12, [x3, #(7 * 8)]
> +       str     x11, [x3, #(6 * 8)]
> +       str     x10, [x3, #(5 * 8)]
> +       str     x9, [x3, #(4 * 8)]
> +       str     x8, [x3, #(3 * 8)]
> +       str     x7, [x3, #(2 * 8)]
> +       str     x6, [x3, #(1 * 8)]
> +       str     x5, [x3, #(0 * 8)]
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       mrs     x20, dbgwcr15_el1
> +       mrs     x19, dbgwcr14_el1
> +       mrs     x18, dbgwcr13_el1
> +       mrs     x17, dbgwcr12_el1
> +       mrs     x16, dbgwcr11_el1
> +       mrs     x15, dbgwcr10_el1
> +       mrs     x14, dbgwcr9_el1
> +       mrs     x13, dbgwcr8_el1
> +       mrs     x12, dbgwcr7_el1
> +       mrs     x11, dbgwcr6_el1
> +       mrs     x10, dbgwcr5_el1
> +       mrs     x9, dbgwcr4_el1
> +       mrs     x8, dbgwcr3_el1
> +       mrs     x7, dbgwcr2_el1
> +       mrs     x6, dbgwcr1_el1
> +       mrs     x5, dbgwcr0_el1
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +
> +1:
> +       str     x20, [x3, #(15 * 8)]
> +       str     x19, [x3, #(14 * 8)]
> +       str     x18, [x3, #(13 * 8)]
> +       str     x17, [x3, #(12 * 8)]
> +       str     x16, [x3, #(11 * 8)]
> +       str     x15, [x3, #(10 * 8)]
> +       str     x14, [x3, #(9 * 8)]
> +       str     x13, [x3, #(8 * 8)]
> +       str     x12, [x3, #(7 * 8)]
> +       str     x11, [x3, #(6 * 8)]
> +       str     x10, [x3, #(5 * 8)]
> +       str     x9, [x3, #(4 * 8)]
> +       str     x8, [x3, #(3 * 8)]
> +       str     x7, [x3, #(2 * 8)]
> +       str     x6, [x3, #(1 * 8)]
> +       str     x5, [x3, #(0 * 8)]
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       mrs     x20, dbgwvr15_el1
> +       mrs     x19, dbgwvr14_el1
> +       mrs     x18, dbgwvr13_el1
> +       mrs     x17, dbgwvr12_el1
> +       mrs     x16, dbgwvr11_el1
> +       mrs     x15, dbgwvr10_el1
> +       mrs     x14, dbgwvr9_el1
> +       mrs     x13, dbgwvr8_el1
> +       mrs     x12, dbgwvr7_el1
> +       mrs     x11, dbgwvr6_el1
> +       mrs     x10, dbgwvr5_el1
> +       mrs     x9, dbgwvr4_el1
> +       mrs     x8, dbgwvr3_el1
> +       mrs     x7, dbgwvr2_el1
> +       mrs     x6, dbgwvr1_el1
> +       mrs     x5, dbgwvr0_el1
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +
> +1:
> +       str     x20, [x3, #(15 * 8)]
> +       str     x19, [x3, #(14 * 8)]
> +       str     x18, [x3, #(13 * 8)]
> +       str     x17, [x3, #(12 * 8)]
> +       str     x16, [x3, #(11 * 8)]
> +       str     x15, [x3, #(10 * 8)]
> +       str     x14, [x3, #(9 * 8)]
> +       str     x13, [x3, #(8 * 8)]
> +       str     x12, [x3, #(7 * 8)]
> +       str     x11, [x3, #(6 * 8)]
> +       str     x10, [x3, #(5 * 8)]
> +       str     x9, [x3, #(4 * 8)]
> +       str     x8, [x3, #(3 * 8)]
> +       str     x7, [x3, #(2 * 8)]
> +       str     x6, [x3, #(1 * 8)]
> +       str     x5, [x3, #(0 * 8)]
> +
> +       mrs     x21, mdccint_el1
> +       str     x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
>  .endm
>
>  .macro restore_sysregs
> @@ -245,7 +442,7 @@ __kvm_hyp_code_start:
>         ldp     x18, x19, [x3, #112]
>         ldp     x20, x21, [x3, #128]
>         ldp     x22, x23, [x3, #144]
> -       ldr     x24, [x3, #160]
> +       ldp     x24, x25, [x3, #160]
>
>         msr     vmpidr_el2,     x4
>         msr     csselr_el1,     x5
> @@ -268,6 +465,198 @@ __kvm_hyp_code_start:
>         msr     amair_el1,      x22
>         msr     cntkctl_el1,    x23
>         msr     par_el1,        x24
> +       msr     mdscr_el1,      x25
> +.endm
> +
> +.macro restore_debug
> +       // x2: base address for cpu context
> +       // x3: tmp register
> +
> +       mrs     x26, id_aa64dfr0_el1
> +       ubfx    x24, x26, #12, #4       // Extract BRPs
> +       ubfx    x25, x26, #20, #4       // Extract WRPs
> +       mov     w26, #15
> +       sub     w24, w26, w24           // How many BPs to skip
> +       sub     w25, w26, w25           // How many WPs to skip
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       ldr     x20, [x3, #(15 * 8)]
> +       ldr     x19, [x3, #(14 * 8)]
> +       ldr     x18, [x3, #(13 * 8)]
> +       ldr     x17, [x3, #(12 * 8)]
> +       ldr     x16, [x3, #(11 * 8)]
> +       ldr     x15, [x3, #(10 * 8)]
> +       ldr     x14, [x3, #(9 * 8)]
> +       ldr     x13, [x3, #(8 * 8)]
> +       ldr     x12, [x3, #(7 * 8)]
> +       ldr     x11, [x3, #(6 * 8)]
> +       ldr     x10, [x3, #(5 * 8)]
> +       ldr     x9, [x3, #(4 * 8)]
> +       ldr     x8, [x3, #(3 * 8)]
> +       ldr     x7, [x3, #(2 * 8)]
> +       ldr     x6, [x3, #(1 * 8)]
> +       ldr     x5, [x3, #(0 * 8)]
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       msr     dbgbcr15_el1, x20
> +       msr     dbgbcr14_el1, x19
> +       msr     dbgbcr13_el1, x18
> +       msr     dbgbcr12_el1, x17
> +       msr     dbgbcr11_el1, x16
> +       msr     dbgbcr10_el1, x15
> +       msr     dbgbcr9_el1, x14
> +       msr     dbgbcr8_el1, x13
> +       msr     dbgbcr7_el1, x12
> +       msr     dbgbcr6_el1, x11
> +       msr     dbgbcr5_el1, x10
> +       msr     dbgbcr4_el1, x9
> +       msr     dbgbcr3_el1, x8
> +       msr     dbgbcr2_el1, x7
> +       msr     dbgbcr1_el1, x6
> +       msr     dbgbcr0_el1, x5
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       ldr     x20, [x3, #(15 * 8)]
> +       ldr     x19, [x3, #(14 * 8)]
> +       ldr     x18, [x3, #(13 * 8)]
> +       ldr     x17, [x3, #(12 * 8)]
> +       ldr     x16, [x3, #(11 * 8)]
> +       ldr     x15, [x3, #(10 * 8)]
> +       ldr     x14, [x3, #(9 * 8)]
> +       ldr     x13, [x3, #(8 * 8)]
> +       ldr     x12, [x3, #(7 * 8)]
> +       ldr     x11, [x3, #(6 * 8)]
> +       ldr     x10, [x3, #(5 * 8)]
> +       ldr     x9, [x3, #(4 * 8)]
> +       ldr     x8, [x3, #(3 * 8)]
> +       ldr     x7, [x3, #(2 * 8)]
> +       ldr     x6, [x3, #(1 * 8)]
> +       ldr     x5, [x3, #(0 * 8)]
> +
> +       adr     x26, 1f
> +       add     x26, x26, x24, lsl #2
> +       br      x26
> +1:
> +       msr     dbgbvr15_el1, x20
> +       msr     dbgbvr14_el1, x19
> +       msr     dbgbvr13_el1, x18
> +       msr     dbgbvr12_el1, x17
> +       msr     dbgbvr11_el1, x16
> +       msr     dbgbvr10_el1, x15
> +       msr     dbgbvr9_el1, x14
> +       msr     dbgbvr8_el1, x13
> +       msr     dbgbvr7_el1, x12
> +       msr     dbgbvr6_el1, x11
> +       msr     dbgbvr5_el1, x10
> +       msr     dbgbvr4_el1, x9
> +       msr     dbgbvr3_el1, x8
> +       msr     dbgbvr2_el1, x7
> +       msr     dbgbvr1_el1, x6
> +       msr     dbgbvr0_el1, x5
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       ldr     x20, [x3, #(15 * 8)]
> +       ldr     x19, [x3, #(14 * 8)]
> +       ldr     x18, [x3, #(13 * 8)]
> +       ldr     x17, [x3, #(12 * 8)]
> +       ldr     x16, [x3, #(11 * 8)]
> +       ldr     x15, [x3, #(10 * 8)]
> +       ldr     x14, [x3, #(9 * 8)]
> +       ldr     x13, [x3, #(8 * 8)]
> +       ldr     x12, [x3, #(7 * 8)]
> +       ldr     x11, [x3, #(6 * 8)]
> +       ldr     x10, [x3, #(5 * 8)]
> +       ldr     x9, [x3, #(4 * 8)]
> +       ldr     x8, [x3, #(3 * 8)]
> +       ldr     x7, [x3, #(2 * 8)]
> +       ldr     x6, [x3, #(1 * 8)]
> +       ldr     x5, [x3, #(0 * 8)]
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       msr     dbgwcr15_el1, x20
> +       msr     dbgwcr14_el1, x19
> +       msr     dbgwcr13_el1, x18
> +       msr     dbgwcr12_el1, x17
> +       msr     dbgwcr11_el1, x16
> +       msr     dbgwcr10_el1, x15
> +       msr     dbgwcr9_el1, x14
> +       msr     dbgwcr8_el1, x13
> +       msr     dbgwcr7_el1, x12
> +       msr     dbgwcr6_el1, x11
> +       msr     dbgwcr5_el1, x10
> +       msr     dbgwcr4_el1, x9
> +       msr     dbgwcr3_el1, x8
> +       msr     dbgwcr2_el1, x7
> +       msr     dbgwcr1_el1, x6
> +       msr     dbgwcr0_el1, x5
> +
> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       ldr     x20, [x3, #(15 * 8)]
> +       ldr     x19, [x3, #(14 * 8)]
> +       ldr     x18, [x3, #(13 * 8)]
> +       ldr     x17, [x3, #(12 * 8)]
> +       ldr     x16, [x3, #(11 * 8)]
> +       ldr     x15, [x3, #(10 * 8)]
> +       ldr     x14, [x3, #(9 * 8)]
> +       ldr     x13, [x3, #(8 * 8)]
> +       ldr     x12, [x3, #(7 * 8)]
> +       ldr     x11, [x3, #(6 * 8)]
> +       ldr     x10, [x3, #(5 * 8)]
> +       ldr     x9, [x3, #(4 * 8)]
> +       ldr     x8, [x3, #(3 * 8)]
> +       ldr     x7, [x3, #(2 * 8)]
> +       ldr     x6, [x3, #(1 * 8)]
> +       ldr     x5, [x3, #(0 * 8)]
> +
> +       adr     x26, 1f
> +       add     x26, x26, x25, lsl #2
> +       br      x26
> +1:
> +       msr     dbgwvr15_el1, x20
> +       msr     dbgwvr14_el1, x19
> +       msr     dbgwvr13_el1, x18
> +       msr     dbgwvr12_el1, x17
> +       msr     dbgwvr11_el1, x16
> +       msr     dbgwvr10_el1, x15
> +       msr     dbgwvr9_el1, x14
> +       msr     dbgwvr8_el1, x13
> +       msr     dbgwvr7_el1, x12
> +       msr     dbgwvr6_el1, x11
> +       msr     dbgwvr5_el1, x10
> +       msr     dbgwvr4_el1, x9
> +       msr     dbgwvr3_el1, x8
> +       msr     dbgwvr2_el1, x7
> +       msr     dbgwvr1_el1, x6
> +       msr     dbgwvr0_el1, x5
> +
> +       ldr     x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
> +       msr     mdccint_el1, x21
>  .endm
>
>  .macro skip_32bit_state tmp, target
> @@ -282,6 +671,11 @@ __kvm_hyp_code_start:
>         tbz     \tmp, #12, \target
>  .endm
>
> +.macro skip_clean_debug_state tmp, target
> +       ldr     \tmp, [x0, #VCPU_DEBUG_FLAGS]
> +       tbz     \tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
> +.endm
> +

At first glance, I was a little confused by the macro name
skip_clean_debug_state.

I suggest renaming this macro to something like
jump_to_label_if_debug_clean.

>  .macro save_guest_32bit_state
>         skip_32bit_state x3, 1f
>
> @@ -297,10 +691,13 @@ __kvm_hyp_code_start:
>         mrs     x4, dacr32_el2
>         mrs     x5, ifsr32_el2
>         mrs     x6, fpexc32_el2
> -       mrs     x7, dbgvcr32_el2
>         stp     x4, x5, [x3]
> -       stp     x6, x7, [x3, #16]
> +       str     x6, [x3, #16]
>
> +       skip_clean_debug_state x8, 2f
> +       mrs     x7, dbgvcr32_el2
> +       str     x7, [x3, #24]
> +2:
>         skip_tee_state x8, 1f
>
>         add     x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
> @@ -323,12 +720,15 @@ __kvm_hyp_code_start:
>
>         add     x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
>         ldp     x4, x5, [x3]
> -       ldp     x6, x7, [x3, #16]
> +       ldr     x6, [x3, #16]
>         msr     dacr32_el2, x4
>         msr     ifsr32_el2, x5
>         msr     fpexc32_el2, x6
> -       msr     dbgvcr32_el2, x7
>
> +       skip_clean_debug_state x8, 2f
> +       ldr     x7, [x3, #24]
> +       msr     dbgvcr32_el2, x7
> +2:
>         skip_tee_state x8, 1f
>
>         add     x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
> @@ -537,6 +937,14 @@ __restore_sysregs:
>         restore_sysregs
>         ret
>
> +__save_debug:
> +       save_debug
> +       ret
> +
> +__restore_debug:
> +       restore_debug
> +       ret
> +
>  __save_fpsimd:
>         save_fpsimd
>         ret
> @@ -568,6 +976,21 @@ ENTRY(__kvm_vcpu_run)
>         bl __save_fpsimd
>         bl __save_sysregs
>
> +       // Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
> +       // is set, we do a full save/restore cycle and disable trapping.
> +       add     x25, x0, #VCPU_CONTEXT
> +       ldr     x25, [x25, #CPU_SYSREG_OFFSET(MDSCR_EL1)]
> +       and     x26, x25, #DBG_MDSCR_KDE
> +       and     x25, x25, #DBG_MDSCR_MDE
> +       adds    xzr, x25, x26
> +       mov     x26, #KVM_ARM64_DEBUG_DIRTY
> +       csel    x25, x26, xzr, ne
> +       ldr     x26, [x0, #VCPU_DEBUG_FLAGS]
> +       orr     x26, x26, x25
> +       cbz     x26, 1f
> +       str     x26, [x0, #VCPU_DEBUG_FLAGS]
> +       bl      __save_debug
> +1:
>         activate_traps
>         activate_vm
>
> @@ -579,6 +1002,10 @@ ENTRY(__kvm_vcpu_run)
>
>         bl __restore_sysregs
>         bl __restore_fpsimd
> +
> +       skip_clean_debug_state x3, 1f
> +       bl      __restore_debug
> +1:
>         restore_guest_32bit_state
>         restore_guest_regs
>
> @@ -595,6 +1022,10 @@ __kvm_vcpu_return:
>         save_guest_regs
>         bl __save_fpsimd
>         bl __save_sysregs
> +
> +       skip_clean_debug_state x3, 1f
> +       bl      __save_debug
> +1:
>         save_guest_32bit_state
>
>         save_timer_state
> @@ -609,6 +1040,12 @@ __kvm_vcpu_return:
>
>         bl __restore_sysregs
>         bl __restore_fpsimd
> +
> +       skip_clean_debug_state x3, 1f
> +       // Clear the dirty flag for the next run
> +       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
> +       bl      __restore_debug
> +1:
>         restore_host_regs
>
>         mov     x0, x1
> --
> 1.8.3.4
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Other than the minor comment mentioned above, this looks
good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 9/9] arm64: KVM: enable trapping of all debug registers
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:40     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:40 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Christoffer Dall, Will Deacon,
	Catalin Marinas, Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 1/9] arm64: KVM: rename pm_fake handler to trap_wi_raz
  2014-05-07 15:20   ` Marc Zyngier
@ 2014-05-19  8:43     ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  8:43 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Christoffer Dall, Will Deacon,
	Catalin Marinas, Ian Campbell

Looks good to me.

FWIW, Reviewed-by: Anup Patel <anup.patel@linaro.org>

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-07 15:20 ` Marc Zyngier
@ 2014-05-19  9:05   ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  9:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On 7 May 2014 20:50, Marc Zyngier <marc.zyngier@arm.com> wrote:
> This patch series adds debug support, a key feature missing from the
> KVM/arm64 port.
>
> The main idea is to keep track of whether the debug registers are
> "dirty" (changed by the guest) or not. In this case, perform the usual
> save/restore dance, for one run only. It means we only have a penalty
> if a guest is actually using the debug registers.
>
> The huge amount of registers is properly frightening, but CPUs
> actually only implement a subset of them. Also, there is a number of
> registers we don't bother emulating (things having to do with external
> debug and OSlock).
>
> This has been tested on a Cortex-A57 platform, running both 32 and
> 64bit guests, on top of 3.15-rc4. This code also lives in my tree in
> the kvm-arm64/debug-trap branch.
>
> Marc Zyngier (9):
>   arm64: KVM: rename pm_fake handler to trap_wi_raz
>   arm64: move DBG_MDSCR_* to asm/debug-monitors.h
>   arm64: KVM: add trap handlers for AArch64 debug registers
>   arm64: KVM: common infrastructure for handling AArch32 CP14/CP15
>   arm64: KVM: use separate tables for AArch32 32 and 64bit traps
>   arm64: KVM: check ordering of all system register tables
>   arm64: KVM: add trap handlers for AArch32 debug registers
>   arm64: KVM: implement lazy world switch for debug registers
>   arm64: KVM: enable trapping of all debug registers
>
>  arch/arm64/include/asm/debug-monitors.h |  19 +-
>  arch/arm64/include/asm/kvm_asm.h        |  39 ++-
>  arch/arm64/include/asm/kvm_coproc.h     |   3 +-
>  arch/arm64/include/asm/kvm_host.h       |  12 +-
>  arch/arm64/kernel/asm-offsets.c         |   1 +
>  arch/arm64/kernel/debug-monitors.c      |   9 -
>  arch/arm64/kvm/handle_exit.c            |   4 +-
>  arch/arm64/kvm/hyp.S                    | 457 ++++++++++++++++++++++++++++-
>  arch/arm64/kvm/sys_regs.c               | 494 +++++++++++++++++++++++++++-----
>  9 files changed, 940 insertions(+), 98 deletions(-)
>
> --
> 1.8.3.4
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Hi Marc,

Overall the patchset looks good to me.

Debug register usage by the guest will be very rare,
so a lazy save/restore makes a lot of sense here.

The only concern is that the amount of time spent in
the world switch will increase once the guest starts
accessing debug registers.

I was wondering if it is possible to detect that the guest
has stopped using the debug HW, so that we can mark the
debug state as clean again (or something similar).

Regards,
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-19  9:05   ` Anup Patel
@ 2014-05-19  9:28     ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-19  9:28 UTC (permalink / raw)
  To: Anup Patel
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On Mon, May 19 2014 at 10:05:42 am BST, Anup Patel <anup.patel@linaro.org> wrote

Hi Anup,

> Overall the patchset looks good to me.
>
> Debug register usage by the guest will be very rare,
> so a lazy save/restore makes a lot of sense here.
>
> The only concern is that the amount of time spent in
> the world switch will increase once the guest starts
> accessing debug registers.
>
> I was wondering if it is possible to detect that the guest
> has stopped using the debug HW, so that we can mark the
> debug state as clean again (or something similar).

If you look carefully at patch #8 (last hunk of the patch), you'll see
that I always reset the debug state to "clean" at the end of a guest
run:

@@ -609,6 +1040,12 @@ __kvm_vcpu_return:
 
        bl __restore_sysregs
        bl __restore_fpsimd
+
+       skip_clean_debug_state x3, 1f
+       // Clear the dirty flag for the next run
+       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
+       bl      __restore_debug
+1:
        restore_host_regs
 
        mov     x0, x1

This ensures that the guest's debug state will only be reloaded if:

- MDSCR_EL1 has either MDE or KDE set (which means the guest is actively
using the debug infrastructure)
- or the guest has written to a trapped register (which marks the state
as dirty).

I don't think we can do less work than this. Or can we?
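
In C terms, the policy amounts to roughly this (a sketch with
assumed helper names like host_ctxt/guest_ctxt; the real logic
is open-coded in hyp.S):

	static void debug_entry(struct kvm_vcpu *vcpu)
	{
		u64 mdscr = vcpu_sys_reg(vcpu, MDSCR_EL1);

		/* Guest has debug enabled, or wrote a trapped reg. */
		if (mdscr & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
			vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;

		if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) {
			save_debug(host_ctxt(vcpu));
			restore_debug(guest_ctxt(vcpu));
			/* trapping stays disabled for this run */
		}
	}

	static void debug_exit(struct kvm_vcpu *vcpu)
	{
		if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) {
			save_debug(guest_ctxt(vcpu));
			vcpu->arch.debug_flags = 0;	/* clean again */
			restore_debug(host_ctxt(vcpu));
		}
	}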

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-19  9:28     ` Marc Zyngier
@ 2014-05-19  9:35       ` Anup Patel
  -1 siblings, 0 replies; 60+ messages in thread
From: Anup Patel @ 2014-05-19  9:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On 19 May 2014 14:58, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On Mon, May 19 2014 at 10:05:42 am BST, Anup Patel <anup.patel@linaro.org> wrote
>
> Hi Anup,
>
>> Overall the patchset looks good to me.
>>
>> Debug register usage by the guest will be very rare,
>> so a lazy save/restore makes a lot of sense here.
>>
>> The only concern is that the amount of time spent in
>> the world switch will increase once the guest starts
>> accessing debug registers.
>>
>> I was wondering if it is possible to detect that the guest
>> has stopped using the debug HW, so that we can mark the
>> debug state as clean again (or something similar).
>
> If you look carefully at patch #8 (last hunk of the patch), you'll see
> that I always reset the debug state to "clean" at the end of a guest
> run:
>
> @@ -609,6 +1040,12 @@ __kvm_vcpu_return:
>
>         bl __restore_sysregs
>         bl __restore_fpsimd
> +
> +       skip_clean_debug_state x3, 1f
> +       // Clear the dirty flag for the next run
> +       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
> +       bl      __restore_debug
> +1:
>         restore_host_regs
>
>         mov     x0, x1
>
> This ensures that the guest's debug state will only be reloaded if:
>
> - MDSCR_EL1 has either MDE or KDE set (which means the guest is actively
> using the debug infrastructure)
> - or the guest has written to a trapped register (which marks the state
> as dirty).

Thanks for pointing that out.

Can you add this info as a comment in patch #8 where you
clear the dirty flag?

>
> I don't think we can do less work than this. Or can we?
>
> Thanks,
>
>         M.
> --
> Jazz is not dead. It just smells funny.

--
Anup

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-19  9:35       ` Anup Patel
@ 2014-05-19 12:22         ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-19 12:22 UTC (permalink / raw)
  To: Anup Patel
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell

On Mon, May 19 2014 at 10:35:58 am BST, Anup Patel <anup.patel@linaro.org> wrote:
> On 19 May 2014 14:58, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On Mon, May 19 2014 at 10:05:42 am BST, Anup Patel
>> <anup.patel@linaro.org> wrote
>>
>> Hi Anup,
>>
>>> Overall the patchset looks good to me.
>>>
>>> Debug register usage by the guest will be very rare,
>>> so a lazy save/restore makes a lot of sense here.
>>>
>>> The only concern is that the amount of time spent in
>>> the world switch will increase once the guest starts
>>> accessing debug registers.
>>>
>>> I was wondering if it is possible to detect that the guest
>>> has stopped using the debug HW, so that we can mark the
>>> debug state as clean again (or something similar).
>>
>> If you look carefully at patch #8 (last hunk of the patch), you'll see
>> that I always reset the debug state to "clean" at the end of a guest
>> run:
>>
>> @@ -609,6 +1040,12 @@ __kvm_vcpu_return:
>>
>>         bl __restore_sysregs
>>         bl __restore_fpsimd
>> +
>> +       skip_clean_debug_state x3, 1f
>> +       // Clear the dirty flag for the next run
>> +       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
>> +       bl      __restore_debug
>> +1:
>>         restore_host_regs
>>
>>         mov     x0, x1
>>
>> This ensures that the guest's debug state will only be reloaded if:
>>
>> - MDSCR_EL1 has either MDE or KDE set (which means the guest is actively
>> using the debug infrastructure)
>> - or the guest has written to a trapped register (which marks the state
>> as dirty).
>
> Thanks for pointing that out.
>
> Can you add this info as a comment in patch #8 where you
> clear the dirty flag?

Right. There are already some comments to that effect just above,
where we compute the dirty state, but I think it doesn't hurt to
repeat it.
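
Something along these lines, perhaps (illustrative wording only):

	// Clear the dirty flag for the next run. The guest debug
	// state is then only reloaded if MDSCR_EL1 has MDE/KDE set,
	// or if the guest writes to a trapped debug register again.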

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-19  9:28     ` Marc Zyngier
@ 2014-05-19 12:32       ` Peter Maydell
  -1 siblings, 0 replies; 60+ messages in thread
From: Peter Maydell @ 2014-05-19 12:32 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Anup Patel, Ian Campbell, kvm, Catalin Marinas, Will Deacon,
	kvmarm, linux-arm-kernel

On 19 May 2014 10:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
> If you look carefully at patch #8 (last hunk of the patch), you'll see
> that I always reset the debug state to "clean" at the end of a guest
> run:
>
> @@ -609,6 +1040,12 @@ __kvm_vcpu_return:
>
>         bl __restore_sysregs
>         bl __restore_fpsimd
> +
> +       skip_clean_debug_state x3, 1f
> +       // Clear the dirty flag for the next run
> +       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
> +       bl      __restore_debug
> +1:
>         restore_host_regs
>
>         mov     x0, x1
>
> This ensures that the guest's debug state will only be reloaded if:
>
> - MDSCR_EL1 has either MDE or KDE set (which means the guest is actively
> using the debug infrastructure)
> - or the guest has written to a trapped register (which marks the state
> as dirty).

Do we also handle the case where the guest didn't
write to the trapped register but userspace did (via
the SET_ONE_REG API)? Maybe this just falls out in the
wash or is handled already...

thanks
-- PMM

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 0/9] arm64: KVM: debug infrastructure support
  2014-05-19 12:32       ` Peter Maydell
@ 2014-05-19 12:59         ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-19 12:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anup Patel, Ian Campbell, kvm, Catalin Marinas, Will Deacon,
	kvmarm, linux-arm-kernel

On Mon, May 19 2014 at  1:32:28 pm BST, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 19 May 2014 10:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> If you look carefully at patch #8 (last hunk of the patch), you'll see
>> that I always reset the debug state to "clean" at the end of a guest
>> run:
>>
>> @@ -609,6 +1040,12 @@ __kvm_vcpu_return:
>>
>>         bl __restore_sysregs
>>         bl __restore_fpsimd
>> +
>> +       skip_clean_debug_state x3, 1f
>> +       // Clear the dirty flag for the next run
>> +       str     xzr, [x0, #VCPU_DEBUG_FLAGS]
>> +       bl      __restore_debug
>> +1:
>>         restore_host_regs
>>
>>         mov     x0, x1
>>
>> This ensures that the guest's debug state will only be reloaded if:
>>
>> - MDSCR_EL1 has either MDE or KDE set (which means the guest is actively
>> using the debug infrastructure)
>> - or the guest has written to a trapped register (which marks the state
>> as dirty).
>
> Do we also handle the case where the guest didn't write to the trapped
> register but userspace did (via the SET_ONE_REG API)? Maybe this just
> falls out in the wash or is handled already...

This is pretty much handled by the same code:

- Userspace wrote to any register but MDSCR_EL1, and MDSCR_EL1 doesn't
have MDE/KDE set. In this case, we don't need to do anything, as the new
state is not in use yet.
- Userspace has written to MDSCR_EL1.{MDE,KDE}, and this indicates we
must restore the state.

Compared to what the guest does, we don't flag the state as dirty
when userspace writes to any of the other debug registers (only
MDSCR_EL1 can be used to enter the "dirty" state). This is not
really a problem, as the dirty flag is only a performance
optimisation (as soon as the guest starts using the debug
registers, we want to disable trapping).
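
For contrast, the guest write path is what sets the dirty flag.
A sketch of the trap handler (assumed accessor names, modelled on
the handlers earlier in the series):

	if (p->is_write) {
		/* Shadow the value and mark the state dirty, so the
		 * full set is loaded on the next guest entry. */
		vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
	} else {
		*vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
	}

A SET_ONE_REG write from userspace only updates the shadow copy;
it takes effect through the MDSCR_EL1.{MDE,KDE} test on the next
entry.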

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 8/9] arm64: KVM: implement lazy world switch for debug registers
  2014-05-19  8:38     ` Anup Patel
@ 2014-05-19 16:01       ` Marc Zyngier
  -1 siblings, 0 replies; 60+ messages in thread
From: Marc Zyngier @ 2014-05-19 16:01 UTC (permalink / raw)
  To: Anup Patel
  Cc: kvmarm, linux-arm-kernel, kvm, Catalin Marinas, Will Deacon,
	Ian Campbell, Christoffer Dall

On 19/05/14 09:38, Anup Patel wrote:
> On 7 May 2014 20:50, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Implement switching of the debug registers. While the number
>> of registers is massive, CPUs usually don't implement them all
>> (A57 has 6 breakpoints and 4 watchpoints, which gives us a total
>> of 22 registers "only").
>>
>> Also, we only save/restore them when MDSCR_EL1 has debug enabled,
>> or when we've flagged the debug registers as dirty. It means that
>> most of the time, we only save/restore MDSCR_EL1.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kernel/asm-offsets.c |   1 +
>>  arch/arm64/kvm/hyp.S            | 449 +++++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 444 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
>> index 646f888..ae73a83 100644
>> --- a/arch/arm64/kernel/asm-offsets.c
>> +++ b/arch/arm64/kernel/asm-offsets.c
>> @@ -120,6 +120,7 @@ int main(void)
>>    DEFINE(VCPU_ESR_EL2,         offsetof(struct kvm_vcpu, arch.fault.esr_el2));
>>    DEFINE(VCPU_FAR_EL2,         offsetof(struct kvm_vcpu, arch.fault.far_el2));
>>    DEFINE(VCPU_HPFAR_EL2,       offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
>> +  DEFINE(VCPU_DEBUG_FLAGS,     offsetof(struct kvm_vcpu, arch.debug_flags));
>>    DEFINE(VCPU_HCR_EL2,         offsetof(struct kvm_vcpu, arch.hcr_el2));
>>    DEFINE(VCPU_IRQ_LINES,       offsetof(struct kvm_vcpu, arch.irq_lines));
>>    DEFINE(VCPU_HOST_CONTEXT,    offsetof(struct kvm_vcpu, arch.host_cpu_context));
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index 2c56012..f9d5a1d 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -21,6 +21,7 @@
>>  #include <asm/assembler.h>
>>  #include <asm/memory.h>
>>  #include <asm/asm-offsets.h>
>> +#include <asm/debug-monitors.h>
>>  #include <asm/fpsimdmacros.h>
>>  #include <asm/kvm.h>
>>  #include <asm/kvm_asm.h>
>> @@ -215,6 +216,7 @@ __kvm_hyp_code_start:
>>         mrs     x22,    amair_el1
>>         mrs     x23,    cntkctl_el1
>>         mrs     x24,    par_el1
>> +       mrs     x25,    mdscr_el1
>>
>>         stp     x4, x5, [x3]
>>         stp     x6, x7, [x3, #16]
>> @@ -226,7 +228,202 @@ __kvm_hyp_code_start:
>>         stp     x18, x19, [x3, #112]
>>         stp     x20, x21, [x3, #128]
>>         stp     x22, x23, [x3, #144]
>> -       str     x24, [x3, #160]
>> +       stp     x24, x25, [x3, #160]
>> +.endm
>> +
>> +.macro save_debug
>> +       // x2: base address for cpu context
>> +       // x3: tmp register
>> +
>> +       mrs     x26, id_aa64dfr0_el1
>> +       ubfx    x24, x26, #12, #4       // Extract BRPs
>> +       ubfx    x25, x26, #20, #4       // Extract WRPs
>> +       mov     w26, #15
>> +       sub     w24, w26, w24           // How many BPs to skip
>> +       sub     w25, w26, w25           // How many WPs to skip
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       mrs     x20, dbgbcr15_el1
>> +       mrs     x19, dbgbcr14_el1
>> +       mrs     x18, dbgbcr13_el1
>> +       mrs     x17, dbgbcr12_el1
>> +       mrs     x16, dbgbcr11_el1
>> +       mrs     x15, dbgbcr10_el1
>> +       mrs     x14, dbgbcr9_el1
>> +       mrs     x13, dbgbcr8_el1
>> +       mrs     x12, dbgbcr7_el1
>> +       mrs     x11, dbgbcr6_el1
>> +       mrs     x10, dbgbcr5_el1
>> +       mrs     x9, dbgbcr4_el1
>> +       mrs     x8, dbgbcr3_el1
>> +       mrs     x7, dbgbcr2_el1
>> +       mrs     x6, dbgbcr1_el1
>> +       mrs     x5, dbgbcr0_el1
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +
>> +1:
>> +       str     x20, [x3, #(15 * 8)]
>> +       str     x19, [x3, #(14 * 8)]
>> +       str     x18, [x3, #(13 * 8)]
>> +       str     x17, [x3, #(12 * 8)]
>> +       str     x16, [x3, #(11 * 8)]
>> +       str     x15, [x3, #(10 * 8)]
>> +       str     x14, [x3, #(9 * 8)]
>> +       str     x13, [x3, #(8 * 8)]
>> +       str     x12, [x3, #(7 * 8)]
>> +       str     x11, [x3, #(6 * 8)]
>> +       str     x10, [x3, #(5 * 8)]
>> +       str     x9, [x3, #(4 * 8)]
>> +       str     x8, [x3, #(3 * 8)]
>> +       str     x7, [x3, #(2 * 8)]
>> +       str     x6, [x3, #(1 * 8)]
>> +       str     x5, [x3, #(0 * 8)]
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       mrs     x20, dbgbvr15_el1
>> +       mrs     x19, dbgbvr14_el1
>> +       mrs     x18, dbgbvr13_el1
>> +       mrs     x17, dbgbvr12_el1
>> +       mrs     x16, dbgbvr11_el1
>> +       mrs     x15, dbgbvr10_el1
>> +       mrs     x14, dbgbvr9_el1
>> +       mrs     x13, dbgbvr8_el1
>> +       mrs     x12, dbgbvr7_el1
>> +       mrs     x11, dbgbvr6_el1
>> +       mrs     x10, dbgbvr5_el1
>> +       mrs     x9, dbgbvr4_el1
>> +       mrs     x8, dbgbvr3_el1
>> +       mrs     x7, dbgbvr2_el1
>> +       mrs     x6, dbgbvr1_el1
>> +       mrs     x5, dbgbvr0_el1
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +
>> +1:
>> +       str     x20, [x3, #(15 * 8)]
>> +       str     x19, [x3, #(14 * 8)]
>> +       str     x18, [x3, #(13 * 8)]
>> +       str     x17, [x3, #(12 * 8)]
>> +       str     x16, [x3, #(11 * 8)]
>> +       str     x15, [x3, #(10 * 8)]
>> +       str     x14, [x3, #(9 * 8)]
>> +       str     x13, [x3, #(8 * 8)]
>> +       str     x12, [x3, #(7 * 8)]
>> +       str     x11, [x3, #(6 * 8)]
>> +       str     x10, [x3, #(5 * 8)]
>> +       str     x9, [x3, #(4 * 8)]
>> +       str     x8, [x3, #(3 * 8)]
>> +       str     x7, [x3, #(2 * 8)]
>> +       str     x6, [x3, #(1 * 8)]
>> +       str     x5, [x3, #(0 * 8)]
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       mrs     x20, dbgwcr15_el1
>> +       mrs     x19, dbgwcr14_el1
>> +       mrs     x18, dbgwcr13_el1
>> +       mrs     x17, dbgwcr12_el1
>> +       mrs     x16, dbgwcr11_el1
>> +       mrs     x15, dbgwcr10_el1
>> +       mrs     x14, dbgwcr9_el1
>> +       mrs     x13, dbgwcr8_el1
>> +       mrs     x12, dbgwcr7_el1
>> +       mrs     x11, dbgwcr6_el1
>> +       mrs     x10, dbgwcr5_el1
>> +       mrs     x9, dbgwcr4_el1
>> +       mrs     x8, dbgwcr3_el1
>> +       mrs     x7, dbgwcr2_el1
>> +       mrs     x6, dbgwcr1_el1
>> +       mrs     x5, dbgwcr0_el1
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +
>> +1:
>> +       str     x20, [x3, #(15 * 8)]
>> +       str     x19, [x3, #(14 * 8)]
>> +       str     x18, [x3, #(13 * 8)]
>> +       str     x17, [x3, #(12 * 8)]
>> +       str     x16, [x3, #(11 * 8)]
>> +       str     x15, [x3, #(10 * 8)]
>> +       str     x14, [x3, #(9 * 8)]
>> +       str     x13, [x3, #(8 * 8)]
>> +       str     x12, [x3, #(7 * 8)]
>> +       str     x11, [x3, #(6 * 8)]
>> +       str     x10, [x3, #(5 * 8)]
>> +       str     x9, [x3, #(4 * 8)]
>> +       str     x8, [x3, #(3 * 8)]
>> +       str     x7, [x3, #(2 * 8)]
>> +       str     x6, [x3, #(1 * 8)]
>> +       str     x5, [x3, #(0 * 8)]
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       mrs     x20, dbgwvr15_el1
>> +       mrs     x19, dbgwvr14_el1
>> +       mrs     x18, dbgwvr13_el1
>> +       mrs     x17, dbgwvr12_el1
>> +       mrs     x16, dbgwvr11_el1
>> +       mrs     x15, dbgwvr10_el1
>> +       mrs     x14, dbgwvr9_el1
>> +       mrs     x13, dbgwvr8_el1
>> +       mrs     x12, dbgwvr7_el1
>> +       mrs     x11, dbgwvr6_el1
>> +       mrs     x10, dbgwvr5_el1
>> +       mrs     x9, dbgwvr4_el1
>> +       mrs     x8, dbgwvr3_el1
>> +       mrs     x7, dbgwvr2_el1
>> +       mrs     x6, dbgwvr1_el1
>> +       mrs     x5, dbgwvr0_el1
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +
>> +1:
>> +       str     x20, [x3, #(15 * 8)]
>> +       str     x19, [x3, #(14 * 8)]
>> +       str     x18, [x3, #(13 * 8)]
>> +       str     x17, [x3, #(12 * 8)]
>> +       str     x16, [x3, #(11 * 8)]
>> +       str     x15, [x3, #(10 * 8)]
>> +       str     x14, [x3, #(9 * 8)]
>> +       str     x13, [x3, #(8 * 8)]
>> +       str     x12, [x3, #(7 * 8)]
>> +       str     x11, [x3, #(6 * 8)]
>> +       str     x10, [x3, #(5 * 8)]
>> +       str     x9, [x3, #(4 * 8)]
>> +       str     x8, [x3, #(3 * 8)]
>> +       str     x7, [x3, #(2 * 8)]
>> +       str     x6, [x3, #(1 * 8)]
>> +       str     x5, [x3, #(0 * 8)]
>> +
>> +       mrs     x21, mdccint_el1
>> +       str     x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
>>  .endm
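
For what it's worth, a word on the unrolled sequences and the computed
branches above: mrs/msr encode the register name in the instruction
itself, so there is no way to index a debug register at runtime.
Instead, we branch into the middle of a fully unrolled sequence (each
A64 instruction being 4 bytes, hence the "lsl #2"), which skips the
registers the CPU doesn't implement. Semantically, each block is no
more than the following C loop -- a sketch only, read_dbgbcr() being a
made-up helper, since the lack of such an indexed accessor is exactly
why the asm has to unroll:

        /* C-level sketch of one computed-branch save block */
        static void save_dbgbcr(struct kvm_cpu_context *ctxt,
                                unsigned int nr_brps)
        {
                int i;

                /* nr_brps: BRPs field of ID_AA64DFR0_EL1, plus one */
                for (i = nr_brps - 1; i >= 0; i--)
                        ctxt->sys_regs[DBGBCR0_EL1 + i] = read_dbgbcr(i);
        }
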
>>
>>  .macro restore_sysregs
>> @@ -245,7 +442,7 @@ __kvm_hyp_code_start:
>>         ldp     x18, x19, [x3, #112]
>>         ldp     x20, x21, [x3, #128]
>>         ldp     x22, x23, [x3, #144]
>> -       ldr     x24, [x3, #160]
>> +       ldp     x24, x25, [x3, #160]
>>
>>         msr     vmpidr_el2,     x4
>>         msr     csselr_el1,     x5
>> @@ -268,6 +465,198 @@ __kvm_hyp_code_start:
>>         msr     amair_el1,      x22
>>         msr     cntkctl_el1,    x23
>>         msr     par_el1,        x24
>> +       msr     mdscr_el1,      x25
>> +.endm
>> +
>> +.macro restore_debug
>> +       // x2: base address for cpu context
>> +       // x3: tmp register
>> +
>> +       mrs     x26, id_aa64dfr0_el1
>> +       ubfx    x24, x26, #12, #4       // Extract BRPs
>> +       ubfx    x25, x26, #20, #4       // Extract WRPs
>> +       mov     w26, #15
>> +       sub     w24, w26, w24           // How many BPs to skip
>> +       sub     w25, w26, w25           // How many WPs to skip
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       ldr     x20, [x3, #(15 * 8)]
>> +       ldr     x19, [x3, #(14 * 8)]
>> +       ldr     x18, [x3, #(13 * 8)]
>> +       ldr     x17, [x3, #(12 * 8)]
>> +       ldr     x16, [x3, #(11 * 8)]
>> +       ldr     x15, [x3, #(10 * 8)]
>> +       ldr     x14, [x3, #(9 * 8)]
>> +       ldr     x13, [x3, #(8 * 8)]
>> +       ldr     x12, [x3, #(7 * 8)]
>> +       ldr     x11, [x3, #(6 * 8)]
>> +       ldr     x10, [x3, #(5 * 8)]
>> +       ldr     x9, [x3, #(4 * 8)]
>> +       ldr     x8, [x3, #(3 * 8)]
>> +       ldr     x7, [x3, #(2 * 8)]
>> +       ldr     x6, [x3, #(1 * 8)]
>> +       ldr     x5, [x3, #(0 * 8)]
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       msr     dbgbcr15_el1, x20
>> +       msr     dbgbcr14_el1, x19
>> +       msr     dbgbcr13_el1, x18
>> +       msr     dbgbcr12_el1, x17
>> +       msr     dbgbcr11_el1, x16
>> +       msr     dbgbcr10_el1, x15
>> +       msr     dbgbcr9_el1, x14
>> +       msr     dbgbcr8_el1, x13
>> +       msr     dbgbcr7_el1, x12
>> +       msr     dbgbcr6_el1, x11
>> +       msr     dbgbcr5_el1, x10
>> +       msr     dbgbcr4_el1, x9
>> +       msr     dbgbcr3_el1, x8
>> +       msr     dbgbcr2_el1, x7
>> +       msr     dbgbcr1_el1, x6
>> +       msr     dbgbcr0_el1, x5
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       ldr     x20, [x3, #(15 * 8)]
>> +       ldr     x19, [x3, #(14 * 8)]
>> +       ldr     x18, [x3, #(13 * 8)]
>> +       ldr     x17, [x3, #(12 * 8)]
>> +       ldr     x16, [x3, #(11 * 8)]
>> +       ldr     x15, [x3, #(10 * 8)]
>> +       ldr     x14, [x3, #(9 * 8)]
>> +       ldr     x13, [x3, #(8 * 8)]
>> +       ldr     x12, [x3, #(7 * 8)]
>> +       ldr     x11, [x3, #(6 * 8)]
>> +       ldr     x10, [x3, #(5 * 8)]
>> +       ldr     x9, [x3, #(4 * 8)]
>> +       ldr     x8, [x3, #(3 * 8)]
>> +       ldr     x7, [x3, #(2 * 8)]
>> +       ldr     x6, [x3, #(1 * 8)]
>> +       ldr     x5, [x3, #(0 * 8)]
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x24, lsl #2
>> +       br      x26
>> +1:
>> +       msr     dbgbvr15_el1, x20
>> +       msr     dbgbvr14_el1, x19
>> +       msr     dbgbvr13_el1, x18
>> +       msr     dbgbvr12_el1, x17
>> +       msr     dbgbvr11_el1, x16
>> +       msr     dbgbvr10_el1, x15
>> +       msr     dbgbvr9_el1, x14
>> +       msr     dbgbvr8_el1, x13
>> +       msr     dbgbvr7_el1, x12
>> +       msr     dbgbvr6_el1, x11
>> +       msr     dbgbvr5_el1, x10
>> +       msr     dbgbvr4_el1, x9
>> +       msr     dbgbvr3_el1, x8
>> +       msr     dbgbvr2_el1, x7
>> +       msr     dbgbvr1_el1, x6
>> +       msr     dbgbvr0_el1, x5
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       ldr     x20, [x3, #(15 * 8)]
>> +       ldr     x19, [x3, #(14 * 8)]
>> +       ldr     x18, [x3, #(13 * 8)]
>> +       ldr     x17, [x3, #(12 * 8)]
>> +       ldr     x16, [x3, #(11 * 8)]
>> +       ldr     x15, [x3, #(10 * 8)]
>> +       ldr     x14, [x3, #(9 * 8)]
>> +       ldr     x13, [x3, #(8 * 8)]
>> +       ldr     x12, [x3, #(7 * 8)]
>> +       ldr     x11, [x3, #(6 * 8)]
>> +       ldr     x10, [x3, #(5 * 8)]
>> +       ldr     x9, [x3, #(4 * 8)]
>> +       ldr     x8, [x3, #(3 * 8)]
>> +       ldr     x7, [x3, #(2 * 8)]
>> +       ldr     x6, [x3, #(1 * 8)]
>> +       ldr     x5, [x3, #(0 * 8)]
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       msr     dbgwcr15_el1, x20
>> +       msr     dbgwcr14_el1, x19
>> +       msr     dbgwcr13_el1, x18
>> +       msr     dbgwcr12_el1, x17
>> +       msr     dbgwcr11_el1, x16
>> +       msr     dbgwcr10_el1, x15
>> +       msr     dbgwcr9_el1, x14
>> +       msr     dbgwcr8_el1, x13
>> +       msr     dbgwcr7_el1, x12
>> +       msr     dbgwcr6_el1, x11
>> +       msr     dbgwcr5_el1, x10
>> +       msr     dbgwcr4_el1, x9
>> +       msr     dbgwcr3_el1, x8
>> +       msr     dbgwcr2_el1, x7
>> +       msr     dbgwcr1_el1, x6
>> +       msr     dbgwcr0_el1, x5
>> +
>> +       add     x3, x2, #CPU_SYSREG_OFFSET(DBGWVR0_EL1)
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       ldr     x20, [x3, #(15 * 8)]
>> +       ldr     x19, [x3, #(14 * 8)]
>> +       ldr     x18, [x3, #(13 * 8)]
>> +       ldr     x17, [x3, #(12 * 8)]
>> +       ldr     x16, [x3, #(11 * 8)]
>> +       ldr     x15, [x3, #(10 * 8)]
>> +       ldr     x14, [x3, #(9 * 8)]
>> +       ldr     x13, [x3, #(8 * 8)]
>> +       ldr     x12, [x3, #(7 * 8)]
>> +       ldr     x11, [x3, #(6 * 8)]
>> +       ldr     x10, [x3, #(5 * 8)]
>> +       ldr     x9, [x3, #(4 * 8)]
>> +       ldr     x8, [x3, #(3 * 8)]
>> +       ldr     x7, [x3, #(2 * 8)]
>> +       ldr     x6, [x3, #(1 * 8)]
>> +       ldr     x5, [x3, #(0 * 8)]
>> +
>> +       adr     x26, 1f
>> +       add     x26, x26, x25, lsl #2
>> +       br      x26
>> +1:
>> +       msr     dbgwvr15_el1, x20
>> +       msr     dbgwvr14_el1, x19
>> +       msr     dbgwvr13_el1, x18
>> +       msr     dbgwvr12_el1, x17
>> +       msr     dbgwvr11_el1, x16
>> +       msr     dbgwvr10_el1, x15
>> +       msr     dbgwvr9_el1, x14
>> +       msr     dbgwvr8_el1, x13
>> +       msr     dbgwvr7_el1, x12
>> +       msr     dbgwvr6_el1, x11
>> +       msr     dbgwvr5_el1, x10
>> +       msr     dbgwvr4_el1, x9
>> +       msr     dbgwvr3_el1, x8
>> +       msr     dbgwvr2_el1, x7
>> +       msr     dbgwvr1_el1, x6
>> +       msr     dbgwvr0_el1, x5
>> +
>> +       ldr     x21, [x2, #CPU_SYSREG_OFFSET(MDCCINT_EL1)]
>> +       msr     mdccint_el1, x21
>>  .endm
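
A note on the skip computation at the top of both save_debug and
restore_debug: the BRPs and WRPs fields of ID_AA64DFR0_EL1 encode the
number of breakpoints/watchpoints *minus one*, so an A57 (6 breakpoints,
4 watchpoints) reports BRPs == 5 and WRPs == 3, and we skip 10 and 12
slots respectively. As a C sketch (read_dfr0() is a hypothetical
accessor):

        u64 dfr0 = read_dfr0();                 /* ID_AA64DFR0_EL1 */
        unsigned int brps = (dfr0 >> 12) & 0xf; /* breakpoints - 1 */
        unsigned int wrps = (dfr0 >> 20) & 0xf; /* watchpoints - 1 */
        unsigned int bp_skip = 15 - brps;       /* x24 in the asm */
        unsigned int wp_skip = 15 - wrps;       /* x25 in the asm */
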
>>
>>  .macro skip_32bit_state tmp, target
>> @@ -282,6 +671,11 @@ __kvm_hyp_code_start:
>>         tbz     \tmp, #12, \target
>>  .endm
>>
>> +.macro skip_clean_debug_state tmp, target
>> +       ldr     \tmp, [x0, #VCPU_DEBUG_FLAGS]
>> +       tbz     \tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
>> +.endm
>> +
> 
> At first glance, I was a little confused by the macro name
> skip_clean_debug_state.
> 
> I suggest renaming this macro to something like
> jump_to_label_if_debug_clean.

Well, we already have:
- skip_32bit_state
- skip_tee_state

Both mean "skip xxx state if not useful in the vcpu context".
Maybe simplifying it to just read "skip_debug_state"?
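
Whatever name we settle on, the macro is only an early-out on the dirty
flag; at the C level it amounts to something like this (a sketch --
switch_debug_regs() is a made-up name standing in for the
save/restore_debug paths):

        if (!(vcpu->arch.debug_flags & (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)))
                return;                  /* clean: leave the hardware alone */
        switch_debug_regs(vcpu);         /* otherwise do the full switch */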

Christoffer, you're much better at naming stuff than I am. Any idea?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2014-05-19 16:01 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-07 15:20 [PATCH 0/9] arm64: KVM: debug infrastructure support Marc Zyngier
2014-05-07 15:20 ` [PATCH 1/9] arm64: KVM: rename pm_fake handler to trap_wi_raz Marc Zyngier
2014-05-07 15:34   ` Peter Maydell
2014-05-07 15:42     ` Marc Zyngier
2014-05-19  8:43   ` Anup Patel
2014-05-07 15:20 ` [PATCH 2/9] arm64: move DBG_MDSCR_* to asm/debug-monitors.h Marc Zyngier
2014-05-07 17:14   ` Will Deacon
2014-05-07 15:20 ` [PATCH 3/9] arm64: KVM: add trap handlers for AArch64 debug registers Marc Zyngier
2014-05-19  8:27   ` Anup Patel
2014-05-07 15:20 ` [PATCH 4/9] arm64: KVM: common infrastructure for handling AArch32 CP14/CP15 Marc Zyngier
2014-05-19  8:29   ` Anup Patel
2014-05-07 15:20 ` [PATCH 5/9] arm64: KVM: use separate tables for AArch32 32 and 64bit traps Marc Zyngier
2014-05-19  8:29   ` Anup Patel
2014-05-07 15:20 ` [PATCH 6/9] arm64: KVM: check ordering of all system register tables Marc Zyngier
2014-05-19  8:31   ` Anup Patel
2014-05-07 15:20 ` [PATCH 7/9] arm64: KVM: add trap handlers for AArch32 debug registers Marc Zyngier
2014-05-19  8:33   ` Anup Patel
2014-05-07 15:20 ` [PATCH 8/9] arm64: KVM: implement lazy world switch for " Marc Zyngier
2014-05-19  8:38   ` Anup Patel
2014-05-19 16:01     ` Marc Zyngier
2014-05-07 15:20 ` [PATCH 9/9] arm64: KVM: enable trapping of all " Marc Zyngier
2014-05-19  8:40   ` Anup Patel
2014-05-07 15:42 ` [PATCH 0/9] arm64: KVM: debug infrastructure support Peter Maydell
2014-05-07 15:57   ` Marc Zyngier
2014-05-19  9:05 ` Anup Patel
2014-05-19  9:28   ` Marc Zyngier
2014-05-19  9:35     ` Anup Patel
2014-05-19 12:22       ` Marc Zyngier
2014-05-19 12:32     ` Peter Maydell
2014-05-19 12:59       ` Marc Zyngier
