* [PATCH v3 00/11] KVM: arm: debug infrastructure support
@ 2015-06-22 10:41 ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

This patch series adds debug support, a key feature missing from the
KVM/armv7 port.

The main idea is borrowed from arm64: keep track of whether the debug
registers are "dirty" (changed by the guest) or not. Only in that case
do we perform the usual save/restore dance, and only for one run. This
means we only pay a penalty when a guest is actively using the debug
registers.
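
As an illustration, the lazy scheme boils down to something like the
sketch below (the flag and helper names here are made up for the
example, they are not the actual KVM symbols):

	/* Illustrative only: a minimal model of the dirty-flag scheme. */
	struct vcpu_debug_state {
		int dirty;		/* set when the guest touches a debug reg */
	};

	static void switch_debug_regs(struct vcpu_debug_state *d)
	{
		if (!d->dirty)
			return;		/* fast path: debug regs were never touched */

		/* save host DBGBVRn/DBGBCRn/DBGWVRn/DBGWCRn, load the guest copies */
		d->dirty = 0;		/* pay the full cost for this run only */
	}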

The number of registers is properly frightening, but CPUs actually
only implement a subset of them. There are also a number of registers
we don't bother emulating (those related to external debug and
OSlock).

External debug is when you actually plug a physical JTAG probe into the
CPU. OSlock is a way to prevent "other software" from playing with the
debug registers. My understanding is that it is only useful in
combination with external debug. In both cases, implementing support is
probably not worth the effort, at least for the time being.

This has been tested on a Cortex-A15 platform, running 32bit guests.

The patches for this series are based off v4.1-rc8 and can be found
at:

https://git.linaro.org/people/zhichao.huang/linux.git
branch: guest-debug/4.1-rc8-v3

From v2 [2]:
- Delete the debug mode enabling/disabling strategy
- Add missing cp14/cp15 trace events

From v1 [1]:
- Added missing cp14 reset functions
- Disable debug mode if we don't need it, to reduce unnecessary switching

[1]: https://lists.cs.columbia.edu/pipermail/kvmarm/2015-May/014729.html
[2]: https://lists.cs.columbia.edu/pipermail/kvmarm/2015-May/014847.html

Zhichao Huang (11):
  KVM: arm: plug guest debug exploit
  KVM: arm: rename pm_fake handler to trap_raz_wi
  KVM: arm: enable to use the ARM_DSCR_MDBGEN macro from KVM assembly
    code
  KVM: arm: common infrastructure for handling AArch32 CP14/CP15
  KVM: arm: check ordering of all system register tables
  KVM: arm: add trap handlers for 32-bit debug registers
  KVM: arm: add trap handlers for 64-bit debug registers
  KVM: arm: implement dirty bit mechanism for debug registers
  KVM: arm: implement lazy world switch for debug registers
  KVM: arm: add a trace event for cp14 traps
  KVM: arm: enable trapping of all debug registers

 arch/arm/include/asm/hw_breakpoint.h |  54 ++---
 arch/arm/include/asm/kvm_asm.h       |  15 ++
 arch/arm/include/asm/kvm_coproc.h    |   3 +-
 arch/arm/include/asm/kvm_host.h      |   6 +
 arch/arm/kernel/asm-offsets.c        |   2 +
 arch/arm/kvm/coproc.c                | 407 ++++++++++++++++++++++++++++++-----
 arch/arm/kvm/handle_exit.c           |   4 +-
 arch/arm/kvm/interrupts.S            |  16 ++
 arch/arm/kvm/interrupts_head.S       | 313 ++++++++++++++++++++++++++-
 arch/arm/kvm/trace.h                 |  30 +++
 10 files changed, 762 insertions(+), 88 deletions(-)

-- 
1.7.12.4

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v3 01/11] KVM: arm: plug guest debug exploit
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang, stable

Hardware debugging in guests is currently not intercepted, which means
that a malicious guest can bring down the entire machine by writing to
the debug registers.

This patch enables trapping of all debug registers, preventing guests
from accessing the debug registers.

This patch also disables debug mode (DBGDSCR) in the guest world all
the time, preventing guests from messing with the host state.

It is also a precursor for later patches, which will need to do more to
world-switch the debug state when necessary.
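
Roughly speaking, the world switch below does the equivalent of the
following C-level sketch (the accessor names are hypothetical, the real
code is the interrupts_head.S hunk in this patch):

	hdcr = read_hdcr();
	hdcr |= HDCR_TDRA | HDCR_TDOSA | HDCR_TDA;	/* trap debug reg accesses */
	write_hdcr(hdcr);				/* on vmentry; cleared again on vmexit */

	host_dbgdscr = read_dbgdscr();			/* saved across the guest run */
	write_dbgdscr(0);				/* debug mode off while the guest runs */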

Cc: <stable@vger.kernel.org>
Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/include/asm/kvm_coproc.h |  3 +-
 arch/arm/kvm/coproc.c             | 60 +++++++++++++++++++++++++++++++++++----
 arch/arm/kvm/handle_exit.c        |  4 +--
 arch/arm/kvm/interrupts_head.S    | 13 ++++++++-
 4 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/kvm_coproc.h b/arch/arm/include/asm/kvm_coproc.h
index 4917c2f..e74ab0f 100644
--- a/arch/arm/include/asm/kvm_coproc.h
+++ b/arch/arm/include/asm/kvm_coproc.h
@@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table);
 int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index f3d88dc..2e12760 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return 1;
 }
 
-int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run)
-{
-	kvm_inject_undefined(vcpu);
-	return 1;
-}
-
 static void reset_mpidr(struct kvm_vcpu *vcpu, const struct coproc_reg *r)
 {
 	/*
@@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return emulate_cp15(vcpu, &params);
 }
 
+/**
+ * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
+ * @vcpu: The VCPU pointer
+ * @run:  The kvm_run struct
+ */
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	struct coproc_params params;
+
+	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
+	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
+	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
+	params.is_64bit = true;
+
+	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
+	params.Op2 = 0;
+	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
+	params.CRm = 0;
+
+	/* raz_wi */
+	(void)pm_fake(vcpu, &params, NULL);
+
+	/* handled */
+	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+	return 1;
+}
+
+/**
+ * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14 access
+ * @vcpu: The VCPU pointer
+ * @run:  The kvm_run struct
+ */
+int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	struct coproc_params params;
+
+	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
+	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
+	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
+	params.is_64bit = false;
+
+	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
+	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
+	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
+	params.Rt2 = 0;
+
+	/* raz_wi */
+	(void)pm_fake(vcpu, &params, NULL);
+
+	/* handled */
+	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+	return 1;
+}
+
 /******************************************************************************
  * Userspace API
  *****************************************************************************/
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 95f12b2..357ad1b 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -104,9 +104,9 @@ static exit_handle_fn arm_exit_handlers[] = {
 	[HSR_EC_WFI]		= kvm_handle_wfx,
 	[HSR_EC_CP15_32]	= kvm_handle_cp15_32,
 	[HSR_EC_CP15_64]	= kvm_handle_cp15_64,
-	[HSR_EC_CP14_MR]	= kvm_handle_cp14_access,
+	[HSR_EC_CP14_MR]	= kvm_handle_cp14_32,
 	[HSR_EC_CP14_LS]	= kvm_handle_cp14_load_store,
-	[HSR_EC_CP14_64]	= kvm_handle_cp14_access,
+	[HSR_EC_CP14_64]	= kvm_handle_cp14_64,
 	[HSR_EC_CP_0_13]	= kvm_handle_cp_0_13_access,
 	[HSR_EC_CP10_ID]	= kvm_handle_cp10_id,
 	[HSR_EC_SVC_HYP]	= handle_svc_hyp,
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 35e4a3a..f85c447 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -97,6 +97,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	mrs	r8, LR_fiq
 	mrs	r9, SPSR_fiq
 	push	{r2-r9}
+
+	/* DBGDSCR reg */
+	mrc	p14, 0, r2, c0, c1, 0
+	push	{r2}
 .endm
 
 .macro pop_host_regs_mode mode
@@ -111,6 +115,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
  * Clobbers all registers, in all modes, except r0 and r1.
  */
 .macro restore_host_regs
+	pop	{r2}
+	mcr	p14, 0, r2, c0, c2, 2
+
 	pop	{r2-r9}
 	msr	r8_fiq, r2
 	msr	r9_fiq, r3
@@ -159,6 +166,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
  * Clobbers *all* registers.
  */
 .macro restore_guest_regs
+	/* reset DBGDSCR to disable debug mode */
+	mov	r2, #0
+	mcr	p14, 0, r2, c0, c2, 2
+
 	restore_guest_regs_mode svc, #VCPU_SVC_REGS
 	restore_guest_regs_mode abt, #VCPU_ABT_REGS
 	restore_guest_regs_mode und, #VCPU_UND_REGS
@@ -607,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
  * (hardware reset value is 0) */
 .macro set_hdcr operation
 	mrc	p15, 4, r2, c1, c1, 1
-	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
+	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
 	.if \operation == vmentry
 	orr	r2, r2, r3		@ Trap some perfmon accesses
 	.else
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 02/11] KVM: arm: rename pm_fake handler to trap_raz_wi
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).

As we're about to use it (a lot) in a different context, rename it
with an (admittedly cryptic) name that makes sense for all users.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
Reviewed-by: Alex Bennee <alex.bennee@linaro.org>
---
 arch/arm/kvm/coproc.c | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 2e12760..9d283d9 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -229,7 +229,7 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
  * must always support PMCCNTR (the cycle counter): we just RAZ/WI for
  * all PM registers, which doesn't crash the guest kernel at least.
  */
-static bool pm_fake(struct kvm_vcpu *vcpu,
+static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 		    const struct coproc_params *p,
 		    const struct coproc_reg *r)
 {
@@ -239,19 +239,19 @@ static bool pm_fake(struct kvm_vcpu *vcpu,
 		return read_zero(vcpu, p);
 }
 
-#define access_pmcr pm_fake
-#define access_pmcntenset pm_fake
-#define access_pmcntenclr pm_fake
-#define access_pmovsr pm_fake
-#define access_pmselr pm_fake
-#define access_pmceid0 pm_fake
-#define access_pmceid1 pm_fake
-#define access_pmccntr pm_fake
-#define access_pmxevtyper pm_fake
-#define access_pmxevcntr pm_fake
-#define access_pmuserenr pm_fake
-#define access_pmintenset pm_fake
-#define access_pmintenclr pm_fake
+#define access_pmcr trap_raz_wi
+#define access_pmcntenset trap_raz_wi
+#define access_pmcntenclr trap_raz_wi
+#define access_pmovsr trap_raz_wi
+#define access_pmselr trap_raz_wi
+#define access_pmceid0 trap_raz_wi
+#define access_pmceid1 trap_raz_wi
+#define access_pmccntr trap_raz_wi
+#define access_pmxevtyper trap_raz_wi
+#define access_pmxevcntr trap_raz_wi
+#define access_pmuserenr trap_raz_wi
+#define access_pmintenset trap_raz_wi
+#define access_pmintenclr trap_raz_wi
 
 /* Architected CP15 registers.
  * CRn denotes the primary register number, but is copied to the CRm in the
@@ -532,8 +532,7 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
 	params.CRm = 0;
 
-	/* raz_wi */
-	(void)pm_fake(vcpu, &params, NULL);
+	(void)trap_raz_wi(vcpu, &params, NULL);
 
 	/* handled */
 	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
@@ -559,8 +558,7 @@ int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
 	params.Rt2 = 0;
 
-	/* raz_wi */
-	(void)pm_fake(vcpu, &params, NULL);
+	(void)trap_raz_wi(vcpu, &params, NULL);
 
 	/* handled */
 	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 03/11] KVM: arm: enable to use the ARM_DSCR_MDBGEN macro from KVM assembly code
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

Add #ifndef __ASSEMBLY__ guards in hw_breakpoint.h, so that the
ARM_DSCR_MDBGEN macro can be used from KVM assembly code.
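
The guard follows the usual pattern for headers shared between C and
assembly, along these lines (an illustrative sketch, not the exact
header contents):

	#ifndef __ASSEMBLY__
	struct task_struct;			/* C-only: prototypes, structs, inlines */
	extern u8 arch_get_debug_arch(void);
	#endif	/* __ASSEMBLY__ */

	#define ARM_DSCR_MDBGEN	(1 << 15)	/* plain #define, also visible to .S files */

so that an assembly file can #include the header and use
ARM_DSCR_MDBGEN without tripping over the C declarations.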

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
Reviewed-by: Alex Bennee <alex.bennee@linaro.org>
---
 arch/arm/include/asm/hw_breakpoint.h | 54 +++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/hw_breakpoint.h b/arch/arm/include/asm/hw_breakpoint.h
index 8e427c7..f2f4c61 100644
--- a/arch/arm/include/asm/hw_breakpoint.h
+++ b/arch/arm/include/asm/hw_breakpoint.h
@@ -3,6 +3,8 @@
 
 #ifdef __KERNEL__
 
+#ifndef __ASSEMBLY__
+
 struct task_struct;
 
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
@@ -44,6 +46,33 @@ static inline void decode_ctrl_reg(u32 reg,
 	ctrl->mismatch	= reg & 0x1;
 }
 
+struct notifier_block;
+struct perf_event;
+struct pmu;
+
+extern struct pmu perf_ops_bp;
+extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+				  int *gen_len, int *gen_type);
+extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+					   unsigned long val, void *data);
+
+extern u8 arch_get_debug_arch(void);
+extern u8 arch_get_max_wp_len(void);
+extern void clear_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+int hw_breakpoint_slots(int type);
+
+#else
+static inline void clear_ptrace_hw_breakpoint(struct task_struct *tsk) {}
+
+#endif	/* CONFIG_HAVE_HW_BREAKPOINT */
+#endif  /* __ASSEMBLY */
+
 /* Debug architecture numbers. */
 #define ARM_DEBUG_ARCH_RESERVED	0	/* In case of ptrace ABI updates. */
 #define ARM_DEBUG_ARCH_V6	1
@@ -110,30 +139,5 @@ static inline void decode_ctrl_reg(u32 reg,
 	asm volatile("mcr p14, 0, %0, " #N "," #M ", " #OP2 : : "r" (VAL));\
 } while (0)
 
-struct notifier_block;
-struct perf_event;
-struct pmu;
-
-extern struct pmu perf_ops_bp;
-extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
-				  int *gen_len, int *gen_type);
-extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
-extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
-extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
-					   unsigned long val, void *data);
-
-extern u8 arch_get_debug_arch(void);
-extern u8 arch_get_max_wp_len(void);
-extern void clear_ptrace_hw_breakpoint(struct task_struct *tsk);
-
-int arch_install_hw_breakpoint(struct perf_event *bp);
-void arch_uninstall_hw_breakpoint(struct perf_event *bp);
-void hw_breakpoint_pmu_read(struct perf_event *bp);
-int hw_breakpoint_slots(int type);
-
-#else
-static inline void clear_ptrace_hw_breakpoint(struct task_struct *tsk) {}
-
-#endif	/* CONFIG_HAVE_HW_BREAKPOINT */
 #endif	/* __KERNEL__ */
 #endif	/* _ARM_HW_BREAKPOINT_H */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

As we're about to trap a bunch of CP14 registers, let's rework
the CP15 handling so it can be generalized and work with multiple
tables.
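
After the rework, the same generic handlers serve both coprocessors and
simply take the relevant table(s) as arguments, roughly (simplified
from the patch below):

	/* CP15: search the target-specific table, then the global cp15_regs[] */
	kvm_handle_cp_32(vcpu, cp15_regs, ARRAY_SIZE(cp15_regs),
			 target_specific, nr_specific);

	/* CP14: no target-specific table, only the global cp14_regs[] */
	kvm_handle_cp_32(vcpu, cp14_regs, ARRAY_SIZE(cp14_regs), NULL, 0);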

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c          | 176 ++++++++++++++++++++++++++---------------
 arch/arm/kvm/interrupts_head.S |   2 +-
 2 files changed, 112 insertions(+), 66 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 9d283d9..d23395b 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
 };
 
+static const struct coproc_reg cp14_regs[] = {
+};
+
 /* Target specific emulation tables */
 static struct kvm_coproc_target_table *target_tables[KVM_ARM_NUM_TARGETS];
 
@@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const struct coproc_params *params,
 	return NULL;
 }
 
-static int emulate_cp15(struct kvm_vcpu *vcpu,
-			const struct coproc_params *params)
+/*
+ * emulate_cp --  tries to match a cp14/cp15 access in a handling table,
+ *                and call the corresponding trap handler.
+ *
+ * @params: pointer to the descriptor of the access
+ * @table: array of trap descriptors
+ * @num: size of the trap descriptor array
+ *
+ * Return 0 if the access has been handled, and -1 if not.
+ */
+static int emulate_cp(struct kvm_vcpu *vcpu,
+			const struct coproc_params *params,
+			const struct coproc_reg *table,
+			size_t num)
 {
-	size_t num;
-	const struct coproc_reg *table, *r;
-
-	trace_kvm_emulate_cp15_imp(params->Op1, params->Rt1, params->CRn,
-				   params->CRm, params->Op2, params->is_write);
+	const struct coproc_reg *r;
 
-	table = get_target_table(vcpu->arch.target, &num);
+	if (!table)
+		return -1;	/* Not handled */
 
-	/* Search target-specific then generic table. */
 	r = find_reg(params, table, num);
-	if (!r)
-		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
 
-	if (likely(r)) {
+	if (r) {
 		/* If we don't have an accessor, we should never get here! */
 		BUG_ON(!r->access);
 
 		if (likely(r->access(vcpu, params, r))) {
 			/* Skip instruction, since it was emulated */
 			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-			return 1;
 		}
-		/* If access function fails, it should complain. */
-	} else {
-		kvm_err("Unsupported guest CP15 access at: %08lx\n",
-			*vcpu_pc(vcpu));
-		print_cp_instr(params);
+
+		/* Handled */
+		return 0;
 	}
+
+	/* Not handled */
+	return -1;
+}
+
+static void unhandled_cp_access(struct kvm_vcpu *vcpu,
+				const struct coproc_params *params)
+{
+	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
+	int cp;
+
+	switch (hsr_ec) {
+	case HSR_EC_CP15_32:
+	case HSR_EC_CP15_64:
+		cp = 15;
+		break;
+	case HSR_EC_CP14_MR:
+	case HSR_EC_CP14_64:
+		cp = 14;
+		break;
+	default:
+		WARN_ON((cp = -1));
+	}
+
+	kvm_err("Unsupported guest CP%d access at: %08lx\n",
+		cp, *vcpu_pc(vcpu));
+	print_cp_instr(params);
 	kvm_inject_undefined(vcpu);
-	return 1;
 }
 
-/**
- * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access
- * @vcpu: The VCPU pointer
- * @run:  The kvm_run struct
- */
-int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
+			const struct coproc_reg *global,
+			size_t nr_global,
+			const struct coproc_reg *target_specific,
+			size_t nr_specific)
 {
 	struct coproc_params params;
 
@@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
 	params.CRm = 0;
 
-	return emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		return 1;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		return 1;
+
+	unhandled_cp_access(vcpu, &params);
+	return 1;
 }
 
 static void reset_coproc_regs(struct kvm_vcpu *vcpu,
@@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu *vcpu,
 			table[i].reset(vcpu, &table[i]);
 }
 
-/**
- * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access
- * @vcpu: The VCPU pointer
- * @run:  The kvm_run struct
- */
-int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
+			const struct coproc_reg *global,
+			size_t nr_global,
+			const struct coproc_reg *target_specific,
+			size_t nr_specific)
 {
 	struct coproc_params params;
 
@@ -510,33 +546,57 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
 	params.Rt2 = 0;
 
-	return emulate_cp15(vcpu, &params);
+	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
+		return 1;
+	if (!emulate_cp(vcpu, &params, global, nr_global))
+		return 1;
+
+	unhandled_cp_access(vcpu, &params);
+	return 1;
 }
 
 /**
- * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
+ * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access
  * @vcpu: The VCPU pointer
  * @run:  The kvm_run struct
  */
-int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-	struct coproc_params params;
+	const struct coproc_reg *target_specific;
+	size_t num;
 
-	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
-	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
-	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
-	params.is_64bit = true;
+	target_specific = get_target_table(vcpu->arch.target, &num);
+	return kvm_handle_cp_64(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
 
-	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
-	params.Op2 = 0;
-	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
-	params.CRm = 0;
+/**
+ * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access
+ * @vcpu: The VCPU pointer
+ * @run:  The kvm_run struct
+ */
+int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	const struct coproc_reg *target_specific;
+	size_t num;
 
-	(void)trap_raz_wi(vcpu, &params, NULL);
+	target_specific = get_target_table(vcpu->arch.target, &num);
+	return kvm_handle_cp_32(vcpu,
+				cp15_regs, ARRAY_SIZE(cp15_regs),
+				target_specific, num);
+}
 
-	/* handled */
-	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-	return 1;
+/**
+ * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
+ * @vcpu: The VCPU pointer
+ * @run:  The kvm_run struct
+ */
+int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	return kvm_handle_cp_64(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
 }
 
 /**
@@ -546,23 +606,9 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
  */
 int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-	struct coproc_params params;
-
-	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
-	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
-	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
-	params.is_64bit = false;
-
-	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
-	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
-	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
-	params.Rt2 = 0;
-
-	(void)trap_raz_wi(vcpu, &params, NULL);
-
-	/* handled */
-	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-	return 1;
+	return kvm_handle_cp_32(vcpu,
+				cp14_regs, ARRAY_SIZE(cp14_regs),
+				NULL, 0);
 }
 
 /******************************************************************************
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index f85c447..a20b9ad 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -618,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
  * (hardware reset value is 0) */
 .macro set_hdcr operation
 	mrc	p15, 4, r2, c1, c1, 1
-	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
+	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
 	.if \operation == vmentry
 	orr	r2, r2, r3		@ Trap some perfmon accesses
 	.else
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 05/11] KVM: arm: check ordering of all system register tables
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

We now have multiple tables for the various system registers
we trap. Make sure we check the order of all of them, as it is
critical that we get the order right (been there, done that...).
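
The ordering being checked is a lexicographic sort on the encoding
fields, roughly along these lines (a simplified sketch, not the exact
cmp_reg() from coproc.h):

	static int cmp_key(const struct coproc_reg *a, const struct coproc_reg *b)
	{
		if (a->CRn != b->CRn)
			return a->CRn - b->CRn;
		if (a->CRm != b->CRm)
			return a->CRm - b->CRm;
		if (a->Op1 != b->Op1)
			return a->Op1 - b->Op1;
		return a->Op2 - b->Op2;
	}

Any adjacent pair comparing >= 0 trips the new check_sysreg_table(), and
kvm_coproc_table_init() now BUGs at boot instead of running with a
misordered table.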

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index d23395b..16d5f69 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -737,6 +737,9 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 3), is32, NULL, get_TLBTR },
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 6), is32, NULL, get_REVIDR },
 
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
+
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, NULL, get_ID_PFR0 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 1), is32, NULL, get_ID_PFR1 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 2), is32, NULL, get_ID_DFR0 },
@@ -752,9 +755,6 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 3), is32, NULL, get_ID_ISAR3 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 4), is32, NULL, get_ID_ISAR4 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 5), is32, NULL, get_ID_ISAR5 },
-
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };
 
 /*
@@ -1297,13 +1297,29 @@ int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	return write_demux_regids(uindices);
 }
 
+static int check_sysreg_table(const struct coproc_reg *table, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 1; i < n; i++) {
+		if (cmp_reg(&table[i-1], &table[i]) >= 0) {
+			kvm_err("sys_reg table %p out of order (%d)\n",
+					table, i - 1);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 void kvm_coproc_table_init(void)
 {
 	unsigned int i;
 
 	/* Make sure tables are unique and in order. */
-	for (i = 1; i < ARRAY_SIZE(cp15_regs); i++)
-		BUG_ON(cmp_reg(&cp15_regs[i-1], &cp15_regs[i]) >= 0);
+	BUG_ON(check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs)));
+	BUG_ON(check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs)));
+	BUG_ON(check_sysreg_table(invariant_cp15, ARRAY_SIZE(invariant_cp15)));
 
 	/* We abuse the reset function to overwrite the table itself. */
 	for (i = 0; i < ARRAY_SIZE(invariant_cp15); i++)
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 05/11] KVM: arm: check ordering of all system register tables
@ 2015-06-22 10:41   ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

We now have multiple tables for the various system registers
we trap. Make sure we check the order of all of them, as it is
critical that we get the order right (been there, done that...).

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index d23395b..16d5f69 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -737,6 +737,9 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 3), is32, NULL, get_TLBTR },
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 6), is32, NULL, get_REVIDR },
 
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
+
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, NULL, get_ID_PFR0 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 1), is32, NULL, get_ID_PFR1 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 2), is32, NULL, get_ID_DFR0 },
@@ -752,9 +755,6 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 3), is32, NULL, get_ID_ISAR3 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 4), is32, NULL, get_ID_ISAR4 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 5), is32, NULL, get_ID_ISAR5 },
-
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };
 
 /*
@@ -1297,13 +1297,29 @@ int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	return write_demux_regids(uindices);
 }
 
+static int check_sysreg_table(const struct coproc_reg *table, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 1; i < n; i++) {
+		if (cmp_reg(&table[i-1], &table[i]) >= 0) {
+			kvm_err("sys_reg table %p out of order (%d)\n",
+					table, i - 1);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 void kvm_coproc_table_init(void)
 {
 	unsigned int i;
 
 	/* Make sure tables are unique and in order. */
-	for (i = 1; i < ARRAY_SIZE(cp15_regs); i++)
-		BUG_ON(cmp_reg(&cp15_regs[i-1], &cp15_regs[i]) >= 0);
+	BUG_ON(check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs)));
+	BUG_ON(check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs)));
+	BUG_ON(check_sysreg_table(invariant_cp15, ARRAY_SIZE(invariant_cp15)));
 
 	/* We abuse the reset function to overwrite the table itself. */
 	for (i = 0; i < ARRAY_SIZE(invariant_cp15); i++)
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

Add handlers for all the 32-bit debug registers.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h  |  12 ++++
 arch/arm/include/asm/kvm_host.h |   3 +
 arch/arm/kernel/asm-offsets.c   |   1 +
 arch/arm/kvm/coproc.c           | 122 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 25410b2..ba65e05 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -52,6 +52,18 @@
 #define c10_AMAIR1	30	/* Auxilary Memory Attribute Indirection Reg1 */
 #define NR_CP15_REGS	31	/* Number of regs (incl. invalid) */
 
+/* 0 is reserved as an invalid value. */
+#define cp14_DBGBVR0	1	/* Debug Breakpoint Value Registers (0-15) */
+#define cp14_DBGBVR15	16
+#define cp14_DBGBCR0	17	/* Debug Breakpoint Control Registers (0-15) */
+#define cp14_DBGBCR15	32
+#define cp14_DBGWVR0	33	/* Debug Watchpoint Value Registers (0-15) */
+#define cp14_DBGWVR15	48
+#define cp14_DBGWCR0	49	/* Debug Watchpoint Control Registers (0-15) */
+#define cp14_DBGWCR15	64
+#define cp14_DBGDSCRext	65	/* Debug Status and Control external */
+#define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
+
 #define ARM_EXCEPTION_RESET	  0
 #define ARM_EXCEPTION_UNDEFINED   1
 #define ARM_EXCEPTION_SOFTWARE    2
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d71607c..3d16820 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -124,6 +124,9 @@ struct kvm_vcpu_arch {
 	struct vgic_cpu vgic_cpu;
 	struct arch_timer_cpu timer_cpu;
 
+	/* System control coprocessor (cp14) */
+	u32 cp14[NR_CP14_REGS];
+
 	/*
 	 * Anything that is not used directly from assembly code goes
 	 * here.
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 871b826..9158de0 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -172,6 +172,7 @@ int main(void)
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_KVM,		offsetof(struct kvm_vcpu, kvm));
   DEFINE(VCPU_MIDR,		offsetof(struct kvm_vcpu, arch.midr));
+  DEFINE(VCPU_CP14,		offsetof(struct kvm_vcpu, arch.cp14));
   DEFINE(VCPU_CP15,		offsetof(struct kvm_vcpu, arch.cp15));
   DEFINE(VCPU_VFP_GUEST,	offsetof(struct kvm_vcpu, arch.vfp_guest));
   DEFINE(VCPU_VFP_HOST,		offsetof(struct kvm_vcpu, arch.host_cpu_context));
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 16d5f69..59b65b7 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -220,6 +220,47 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+static bool trap_debug32(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	if (p->is_write)
+		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
+	else
+		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+
+	return true;
+}
+
+/* DBGIDR (RO) Debug ID */
+static bool trap_dbgidr(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	u32 val;
+
+	if (p->is_write)
+		return ignore_write(vcpu, p);
+
+	ARM_DBG_READ(c0, c0, 0, val);
+	*vcpu_reg(vcpu, p->Rt1) = val;
+
+	return true;
+}
+
+/* DBGDSCRint (RO) Debug Status and Control Register */
+static bool trap_dbgdscr(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	if (p->is_write)
+		return ignore_write(vcpu, p);
+
+	*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+
+	return true;
+}
+
 /*
  * We could trap ID_DFR0 and tell the guest we don't support performance
  * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
@@ -375,7 +416,88 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
 };
 
+#define DBG_BCR_BVR_WCR_WVR(n)					\
+	/* DBGBVRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 4), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGBVR0 + (n)), 0 },	\
+	/* DBGBCRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 5), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGBCR0 + (n)), 0 },	\
+	/* DBGWVRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 6), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGWVR0 + (n)), 0 },	\
+	/* DBGWCRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 7), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGWCR0 + (n)), 0 }
+
+/* No OS DBGBXVR mechanism implemented. */
+#define DBGBXVR(n)						\
+	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
+
+/*
+ * Trapped cp14 registers. We generally ignore most of the external
+ * debug, on the principle that they don't really make sense to a
+ * guest. Revisit this one day, should this principle change.
+ */
 static const struct coproc_reg cp14_regs[] = {
+	/* DBGIDR */
+	{ CRn( 0), CRm( 0), Op1( 0), Op2( 0), is32, trap_dbgidr},
+	/* DBGDTRRXext */
+	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDSCRint */
+	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
+				NULL, cp14_DBGDSCRext },
+	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDSCRext */
+	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
+				reset_val, cp14_DBGDSCRext, 0 },
+	DBG_BCR_BVR_WCR_WVR(2),
+	/* DBGDTRRXext */
+	{ CRn( 0), CRm( 3), Op1( 0), Op2( 2), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(3),
+	DBG_BCR_BVR_WCR_WVR(4),
+	/* DBGDTR[RT]Xint */
+	{ CRn( 0), CRm( 5), Op1( 0), Op2( 0), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(5),
+	DBG_BCR_BVR_WCR_WVR(6),
+	/* DBGVCR */
+	{ CRn( 0), CRm( 7), Op1( 0), Op2( 0), is32, trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(7),
+	DBG_BCR_BVR_WCR_WVR(8),
+	DBG_BCR_BVR_WCR_WVR(9),
+	DBG_BCR_BVR_WCR_WVR(10),
+	DBG_BCR_BVR_WCR_WVR(11),
+	DBG_BCR_BVR_WCR_WVR(12),
+	DBG_BCR_BVR_WCR_WVR(13),
+	DBG_BCR_BVR_WCR_WVR(14),
+	DBG_BCR_BVR_WCR_WVR(15),
+
+	DBGBXVR(0),
+	/* DBGOSLAR */
+	{ CRn( 1), CRm( 0), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(1),
+	/* DBGOSLSR */
+	{ CRn( 1), CRm( 1), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(2),
+	DBGBXVR(3),
+	/* DBGOSDLR */
+	{ CRn( 1), CRm( 3), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(4),
+	DBGBXVR(5),
+	/* DBGPRSR */
+	{ CRn( 1), CRm( 5), Op1( 0), Op2( 4), is32, trap_raz_wi },
+
+	DBGBXVR(6),
+	DBGBXVR(7),
+	DBGBXVR(8),
+	DBGBXVR(9),
+	DBGBXVR(10),
+	DBGBXVR(11),
+	DBGBXVR(12),
+	DBGBXVR(13),
+	DBGBXVR(14),
+	DBGBXVR(15),
 };
 
 /* Target specific emulation tables */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers
@ 2015-06-22 10:41   ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

Add handlers for all the 32-bit debug registers.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h  |  12 ++++
 arch/arm/include/asm/kvm_host.h |   3 +
 arch/arm/kernel/asm-offsets.c   |   1 +
 arch/arm/kvm/coproc.c           | 122 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 25410b2..ba65e05 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -52,6 +52,18 @@
 #define c10_AMAIR1	30	/* Auxilary Memory Attribute Indirection Reg1 */
 #define NR_CP15_REGS	31	/* Number of regs (incl. invalid) */
 
+/* 0 is reserved as an invalid value. */
+#define cp14_DBGBVR0	1	/* Debug Breakpoint Value Registers (0-15) */
+#define cp14_DBGBVR15	16
+#define cp14_DBGBCR0	17	/* Debug Breakpoint Control Registers (0-15) */
+#define cp14_DBGBCR15	32
+#define cp14_DBGWVR0	33	/* Debug Watchpoint Value Registers (0-15) */
+#define cp14_DBGWVR15	48
+#define cp14_DBGWCR0	49	/* Debug Watchpoint Control Registers (0-15) */
+#define cp14_DBGWCR15	64
+#define cp14_DBGDSCRext	65	/* Debug Status and Control external */
+#define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
+
 #define ARM_EXCEPTION_RESET	  0
 #define ARM_EXCEPTION_UNDEFINED   1
 #define ARM_EXCEPTION_SOFTWARE    2
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d71607c..3d16820 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -124,6 +124,9 @@ struct kvm_vcpu_arch {
 	struct vgic_cpu vgic_cpu;
 	struct arch_timer_cpu timer_cpu;
 
+	/* System control coprocessor (cp14) */
+	u32 cp14[NR_CP14_REGS];
+
 	/*
 	 * Anything that is not used directly from assembly code goes
 	 * here.
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 871b826..9158de0 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -172,6 +172,7 @@ int main(void)
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_KVM,		offsetof(struct kvm_vcpu, kvm));
   DEFINE(VCPU_MIDR,		offsetof(struct kvm_vcpu, arch.midr));
+  DEFINE(VCPU_CP14,		offsetof(struct kvm_vcpu, arch.cp14));
   DEFINE(VCPU_CP15,		offsetof(struct kvm_vcpu, arch.cp15));
   DEFINE(VCPU_VFP_GUEST,	offsetof(struct kvm_vcpu, arch.vfp_guest));
   DEFINE(VCPU_VFP_HOST,		offsetof(struct kvm_vcpu, arch.host_cpu_context));
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 16d5f69..59b65b7 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -220,6 +220,47 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+static bool trap_debug32(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	if (p->is_write)
+		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
+	else
+		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+
+	return true;
+}
+
+/* DBGIDR (RO) Debug ID */
+static bool trap_dbgidr(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	u32 val;
+
+	if (p->is_write)
+		return ignore_write(vcpu, p);
+
+	ARM_DBG_READ(c0, c0, 0, val);
+	*vcpu_reg(vcpu, p->Rt1) = val;
+
+	return true;
+}
+
+/* DBGDSCRint (RO) Debug Status and Control Register */
+static bool trap_dbgdscr(struct kvm_vcpu *vcpu,
+			const struct coproc_params *p,
+			const struct coproc_reg *r)
+{
+	if (p->is_write)
+		return ignore_write(vcpu, p);
+
+	*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+
+	return true;
+}
+
 /*
  * We could trap ID_DFR0 and tell the guest we don't support performance
  * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
@@ -375,7 +416,88 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
 };
 
+#define DBG_BCR_BVR_WCR_WVR(n)					\
+	/* DBGBVRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 4), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGBVR0 + (n)), 0 },	\
+	/* DBGBCRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 5), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGBCR0 + (n)), 0 },	\
+	/* DBGWVRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 6), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGWVR0 + (n)), 0 },	\
+	/* DBGWCRn */						\
+	{ CRn( 0), CRm((n)), Op1( 0), Op2( 7), is32,		\
+	  trap_debug32,	reset_val, (cp14_DBGWCR0 + (n)), 0 }
+
+/* No OS DBGBXVR mechanism implemented. */
+#define DBGBXVR(n)						\
+	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
+
+/*
+ * Trapped cp14 registers. We generally ignore most of the external
+ * debug, on the principle that they don't really make sense to a
+ * guest. Revisit this one day, should this principle change.
+ */
 static const struct coproc_reg cp14_regs[] = {
+	/* DBGIDR */
+	{ CRn( 0), CRm( 0), Op1( 0), Op2( 0), is32, trap_dbgidr},
+	/* DBGDTRRXext */
+	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDSCRint */
+	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
+				NULL, cp14_DBGDSCRext },
+	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDSCRext */
+	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
+				reset_val, cp14_DBGDSCRext, 0 },
+	DBG_BCR_BVR_WCR_WVR(2),
+	/* DBGDTRTXext */
+	{ CRn( 0), CRm( 3), Op1( 0), Op2( 2), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(3),
+	DBG_BCR_BVR_WCR_WVR(4),
+	/* DBGDTR[RT]Xint */
+	{ CRn( 0), CRm( 5), Op1( 0), Op2( 0), is32, trap_raz_wi },
+	DBG_BCR_BVR_WCR_WVR(5),
+	DBG_BCR_BVR_WCR_WVR(6),
+	/* DBGVCR */
+	{ CRn( 0), CRm( 7), Op1( 0), Op2( 0), is32, trap_debug32 },
+	DBG_BCR_BVR_WCR_WVR(7),
+	DBG_BCR_BVR_WCR_WVR(8),
+	DBG_BCR_BVR_WCR_WVR(9),
+	DBG_BCR_BVR_WCR_WVR(10),
+	DBG_BCR_BVR_WCR_WVR(11),
+	DBG_BCR_BVR_WCR_WVR(12),
+	DBG_BCR_BVR_WCR_WVR(13),
+	DBG_BCR_BVR_WCR_WVR(14),
+	DBG_BCR_BVR_WCR_WVR(15),
+
+	DBGBXVR(0),
+	/* DBGOSLAR */
+	{ CRn( 1), CRm( 0), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(1),
+	/* DBGOSLSR */
+	{ CRn( 1), CRm( 1), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(2),
+	DBGBXVR(3),
+	/* DBGOSDLR */
+	{ CRn( 1), CRm( 3), Op1( 0), Op2( 4), is32, trap_raz_wi },
+	DBGBXVR(4),
+	DBGBXVR(5),
+	/* DBGPRSR */
+	{ CRn( 1), CRm( 5), Op1( 0), Op2( 4), is32, trap_raz_wi },
+
+	DBGBXVR(6),
+	DBGBXVR(7),
+	DBGBXVR(8),
+	DBGBXVR(9),
+	DBGBXVR(10),
+	DBGBXVR(11),
+	DBGBXVR(12),
+	DBGBXVR(13),
+	DBGBXVR(14),
+	DBGBXVR(15),
 };
 
 /* Target specific emulation tables */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit debug registers
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

Add handlers for all the 64-bit debug registers.

There is an overlap between the 32-bit and 64-bit registers. Make sure
that 64-bit registers precede 32-bit ones.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 59b65b7..eeee648 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -435,9 +435,17 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
 
 /*
+ * Architected CP14 registers.
+ *
  * Trapped cp14 registers. We generally ignore most of the external
  * debug, on the principle that they don't really make sense to a
 * guest. Revisit this one day, should this principle change.
+ *
+ * CRn denotes the primary register number, but is copied to the CRm in the
+ * user space API for 64-bit register access in line with the terminology used
+ * in the ARM ARM.
+ * Important: Must be sorted ascending by CRn, CRm, Op1, Op2 and with 64-bit
+ *            registers preceding 32-bit ones.
  */
 static const struct coproc_reg cp14_regs[] = {
 	/* DBGIDR */
@@ -445,10 +453,14 @@ static const struct coproc_reg cp14_regs[] = {
 	/* DBGDTRRXext */
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
 	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDRAR (64bit) */
+	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is64, trap_raz_wi},
 	/* DBGDSCRint */
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
 				NULL, cp14_DBGDSCRext },
 	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDSAR (64bit) */
+	{ CRn( 0), CRm( 2), Op1( 0), Op2( 0), is64, trap_raz_wi},
 	/* DBGDSCRext */
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
 				reset_val, cp14_DBGDSCRext, 0 },
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit debug registers
@ 2015-06-22 10:41   ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

Add handlers for all the 64-bit debug registers.

There is an overlap between the 32-bit and 64-bit registers. Make sure
that 64-bit registers precede 32-bit ones.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 59b65b7..eeee648 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -435,9 +435,17 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
 
 /*
+ * Architected CP14 registers.
+ *
  * Trapped cp14 registers. We generally ignore most of the external
  * debug, on the principle that they don't really make sense to a
 * guest. Revisit this one day, should this principle change.
+ *
+ * CRn denotes the primary register number, but is copied to the CRm in the
+ * user space API for 64-bit register access in line with the terminology used
+ * in the ARM ARM.
+ * Important: Must be sorted ascending by CRn, CRm, Op1, Op2 and with 64-bit
+ *            registers preceding 32-bit ones.
  */
 static const struct coproc_reg cp14_regs[] = {
 	/* DBGIDR */
@@ -445,10 +453,14 @@ static const struct coproc_reg cp14_regs[] = {
 	/* DBGDTRRXext */
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
 	DBG_BCR_BVR_WCR_WVR(0),
+	/* DBGDRAR (64bit) */
+	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is64, trap_raz_wi},
 	/* DBGDSCRint */
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
 				NULL, cp14_DBGDSCRext },
 	DBG_BCR_BVR_WCR_WVR(1),
+	/* DBGDSAR (64bit) */
+	{ CRn( 0), CRm( 2), Op1( 0), Op2( 0), is64, trap_raz_wi},
 	/* DBGDSCRext */
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
 				reset_val, cp14_DBGDSCRext, 0 },
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

The trapping code keeps track of the state of the debug registers,
allowing the switch code to implement a lazy switching strategy.
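
In rough C, the policy amounts to the following (a sketch only; the real
work is done by the assembly macros below, and the helper and variable
names used here are stand-ins, not functions added by this patch):

	/* on guest entry */
	if (vcpu->arch.debug_flags & KVM_ARM_DEBUG_DIRTY) {
		/* coming back from a trap: stop trapping, do the full switch */
		disable_debug_traps();
		save_host_debug_regs();
		restore_guest_debug_regs();
	} else if (guest_dbgdscr & ARM_DSCR_MDBGEN) {
		/* debug actively in use: same dance, and remember it */
		vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;
		disable_debug_traps();
		save_host_debug_regs();
		restore_guest_debug_regs();
	} else {
		enable_debug_traps();
	}

	/* on guest exit */
	if (vcpu->arch.debug_flags & KVM_ARM_DEBUG_DIRTY) {
		save_guest_debug_regs();
		restore_host_debug_regs();
		vcpu->arch.debug_flags = 0;	/* the host owns the registers again */
	}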

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h  |  3 +++
 arch/arm/include/asm/kvm_host.h |  3 +++
 arch/arm/kernel/asm-offsets.c   |  1 +
 arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
 arch/arm/kvm/interrupts_head.S  | 42 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index ba65e05..4fb64cf 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -64,6 +64,9 @@
 #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
 #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
 
+#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
+#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
+
 #define ARM_EXCEPTION_RESET	  0
 #define ARM_EXCEPTION_UNDEFINED   1
 #define ARM_EXCEPTION_SOFTWARE    2
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 3d16820..09b54bf 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
 	/* System control coprocessor (cp14) */
 	u32 cp14[NR_CP14_REGS];
 
+	/* Debug state */
+	u32 debug_flags;
+
 	/*
 	 * Anything that is not used directly from assembly code goes
 	 * here.
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 9158de0..e876109 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -185,6 +185,7 @@ int main(void)
   DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu, arch.regs.fiq_regs));
   DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_pc));
   DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_cpsr));
+  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
   DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
   DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
   DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index eeee648..fc0c2ef 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+/*
+ * We want to avoid world-switching all the DBG registers all the
+ * time:
+ *
+ * - If we've touched any debug register, it is likely that we're
+ *   going to touch more of them. It then makes sense to disable the
+ *   traps and start doing the save/restore dance
+ * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
+ *   to save/restore the registers, as the guest depends on them.
+ *
+ * For this, we use a DIRTY bit, indicating the guest has modified the
+ * debug registers, used as follows:
+ *
+ * On guest entry:
+ * - If the dirty bit is set (because we're coming back from trapping),
+ *   disable the traps, save host registers, restore guest registers.
+ * - If debug is actively in use (ARM_DSCR_MDBGEN set),
+ *   set the dirty bit, disable the traps, save host registers,
+ *   restore guest registers.
+ * - Otherwise, enable the traps
+ *
+ * On guest exit:
+ * - If the dirty bit is set, save guest registers, restore host
+ *   registers and clear the dirty bit. This ensures that the host can
+ *   now use the debug registers.
+ *
+ * Notice:
+ * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
+ *   debug is always actively in use (ARM_DSCR_MDBGEN set).
+ *   We have to do the save/restore dance in this case, because the
+ *   host and the guest might use their respective debug registers
+ *   at any moment.
+ */
 static bool trap_debug32(struct kvm_vcpu *vcpu,
 			const struct coproc_params *p,
 			const struct coproc_reg *r)
 {
-	if (p->is_write)
+	if (p->is_write) {
 		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
-	else
+		vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;
+	} else {
 		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+	}
 
 	return true;
 }
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index a20b9ad..5662c39 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -1,4 +1,6 @@
 #include <linux/irqchip/arm-gic.h>
+#include <asm/hw_breakpoint.h>
+#include <asm/kvm_asm.h>
 #include <asm/assembler.h>
 
 #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
@@ -407,6 +409,46 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	mcr	p15, 2, r12, c0, c0, 0	@ CSSELR
 .endm
 
+/* Assume vcpu pointer in vcpu reg, clobbers r5 */
+.macro skip_debug_state target
+	ldr	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+	cmp	r5, #KVM_ARM_DEBUG_DIRTY
+	bne	\target
+1:
+.endm
+
+/* Compute debug state: If ARM_DSCR_MDBGEN or KVM_ARM_DEBUG_DIRTY
+ * is set, we do a full save/restore cycle and disable trapping.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r5, r6
+ */
+.macro compute_debug_state target
+	// Check the state of DBGDSCR
+	ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
+	and	r6, r5, #ARM_DSCR_MDBGEN
+	cmp	r6, #0
+	beq	9998f	   // Nothing to see there
+
+	// If ARM_DSCR_MDBGEN bit was set, we must set the flag
+	mov	r5, #KVM_ARM_DEBUG_DIRTY
+	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+	b	9999f	   // Don't skip restore
+
+9998:
+	// Otherwise load the flags from memory in case we recently
+	// trapped
+	skip_debug_state \target
+9999:
+.endm
+
+/* Assume vcpu pointer in vcpu reg, clobbers r5 */
+.macro clear_debug_dirty_bit
+	mov	r5, #0
+	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+.endm
+
 /*
  * Save the VGIC CPU state into memory
  *
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
@ 2015-06-22 10:41   ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

The trapping code keeps track of the state of the debug registers,
allowing the switch code to implement a lazy switching strategy.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h  |  3 +++
 arch/arm/include/asm/kvm_host.h |  3 +++
 arch/arm/kernel/asm-offsets.c   |  1 +
 arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
 arch/arm/kvm/interrupts_head.S  | 42 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index ba65e05..4fb64cf 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -64,6 +64,9 @@
 #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
 #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
 
+#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
+#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
+
 #define ARM_EXCEPTION_RESET	  0
 #define ARM_EXCEPTION_UNDEFINED   1
 #define ARM_EXCEPTION_SOFTWARE    2
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 3d16820..09b54bf 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
 	/* System control coprocessor (cp14) */
 	u32 cp14[NR_CP14_REGS];
 
+	/* Debug state */
+	u32 debug_flags;
+
 	/*
 	 * Anything that is not used directly from assembly code goes
 	 * here.
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 9158de0..e876109 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -185,6 +185,7 @@ int main(void)
   DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu, arch.regs.fiq_regs));
   DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_pc));
   DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_cpsr));
+  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
   DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
   DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
   DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index eeee648..fc0c2ef 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+/*
+ * We want to avoid world-switching all the DBG registers all the
+ * time:
+ *
+ * - If we've touched any debug register, it is likely that we're
+ *   going to touch more of them. It then makes sense to disable the
+ *   traps and start doing the save/restore dance
+ * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
+ *   to save/restore the registers, as the guest depends on them.
+ *
+ * For this, we use a DIRTY bit, indicating the guest has modified the
+ * debug registers, used as follows:
+ *
+ * On guest entry:
+ * - If the dirty bit is set (because we're coming back from trapping),
+ *   disable the traps, save host registers, restore guest registers.
+ * - If debug is actively in use (ARM_DSCR_MDBGEN set),
+ *   set the dirty bit, disable the traps, save host registers,
+ *   restore guest registers.
+ * - Otherwise, enable the traps
+ *
+ * On guest exit:
+ * - If the dirty bit is set, save guest registers, restore host
+ *   registers and clear the dirty bit. This ensures that the host can
+ *   now use the debug registers.
+ *
+ * Notice:
+ * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
+ *   debug is always actively in use (ARM_DSCR_MDBGEN set).
+ *   We have to do the save/restore dance in this case, because the
+ *   host and the guest might use their respective debug registers
+ *   at any moment.
+ */
 static bool trap_debug32(struct kvm_vcpu *vcpu,
 			const struct coproc_params *p,
 			const struct coproc_reg *r)
 {
-	if (p->is_write)
+	if (p->is_write) {
 		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
-	else
+		vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;
+	} else {
 		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
+	}
 
 	return true;
 }
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index a20b9ad..5662c39 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -1,4 +1,6 @@
 #include <linux/irqchip/arm-gic.h>
+#include <asm/hw_breakpoint.h>
+#include <asm/kvm_asm.h>
 #include <asm/assembler.h>
 
 #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
@@ -407,6 +409,46 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	mcr	p15, 2, r12, c0, c0, 0	@ CSSELR
 .endm
 
+/* Assume vcpu pointer in vcpu reg, clobbers r5 */
+.macro skip_debug_state target
+	ldr	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+	cmp	r5, #KVM_ARM_DEBUG_DIRTY
+	bne	\target
+1:
+.endm
+
+/* Compute debug state: If ARM_DSCR_MDBGEN or KVM_ARM_DEBUG_DIRTY
+ * is set, we do a full save/restore cycle and disable trapping.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r5, r6
+ */
+.macro compute_debug_state target
+	// Check the state of DBGDSCR
+	ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
+	and	r6, r5, #ARM_DSCR_MDBGEN
+	cmp	r6, #0
+	beq	9998f	   // Nothing to see there
+
+	// If ARM_DSCR_MDBGEN bit was set, we must set the flag
+	mov	r5, #KVM_ARM_DEBUG_DIRTY
+	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+	b	9999f	   // Don't skip restore
+
+9998:
+	// Otherwise load the flags from memory in case we recently
+	// trapped
+	skip_debug_state \target
+9999:
+.endm
+
+/* Assume vcpu pointer in vcpu reg, clobbers r5 */
+.macro clear_debug_dirty_bit
+	mov	r5, #0
+	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
+.endm
+
 /*
  * Save the VGIC CPU state into memory
  *
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 09/11] KVM: arm: implement lazy world switch for debug registers
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

Implement switching of the debug registers. While the number
of registers is massive, CPUs usually don't implement them all
(A15 has 6 breakpoints and 4 watchpoints, which gives us a total
of 22 registers "only").

Notice that, for ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in
the guest, debug is always actively in use (ARM_DSCR_MDBGEN set).

We have to do the save/restore dance in this case, because the host
and the guest might use their respective debug registers at any moment.

If the CONFIG_HAVE_HW_BREAKPOINT is not set, and if no one flagged
the debug registers as dirty, we only save/restore DBGDSCR.
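
For reference, the number of implemented breakpoints and watchpoints is
read back from DBGDIDR, exactly as the read_hw_dbg_num assembly macro in
this patch does. A C sketch of the same computation (not code from this
patch; BRPs live in DBGDIDR[27:24] and WRPs in DBGDIDR[31:28], both
encoded as "number minus one"):

	u32 didr, brps, wrps;

	asm volatile("mrc p14, 0, %0, c0, c0, 0" : "=r" (didr));	/* DBGDIDR */
	brps = ((didr >> 24) & 0xf) + 1;	/* e.g. 6 on a Cortex-A15 */
	wrps = ((didr >> 28) & 0xf) + 1;	/* e.g. 4 on a Cortex-A15 */

The switch code then skips the 16 - brps breakpoint pairs and 16 - wrps
watchpoint pairs that the CPU does not implement.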

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/interrupts.S      |  16 +++
 arch/arm/kvm/interrupts_head.S | 249 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 263 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 79caf79..d626275 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -116,6 +116,12 @@ ENTRY(__kvm_vcpu_run)
 	read_cp15_state store_to_vcpu = 0
 	write_cp15_state read_from_vcpu = 1
 
+	@ Store hardware CP14 state and load guest state
+	compute_debug_state 1f
+	bl __save_host_debug_regs
+	bl __restore_guest_debug_regs
+
+1:
 	@ If the host kernel has not been configured with VFPv3 support,
 	@ then it is safer if we deny guests from using it as well.
 #ifdef CONFIG_VFPv3
@@ -201,6 +207,16 @@ after_vfp_restore:
 	mrc	p15, 0, r2, c0, c0, 5
 	mcr	p15, 4, r2, c0, c0, 5
 
+	@ Store guest CP14 state and restore host state
+	skip_debug_state 1f
+	bl __save_guest_debug_regs
+	bl __restore_host_debug_regs
+	/* Clear the dirty flag for the next run, as all the state has
+	 * already been saved. Note that we nuke the whole 32bit word.
+	 * If we ever add more flags, we'll have to be more careful...
+	 */
+	clear_debug_dirty_bit
+1:
 	@ Store guest CP15 state and restore host state
 	read_cp15_state store_to_vcpu = 1
 	write_cp15_state read_from_vcpu = 0
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 5662c39..ed406be 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -7,6 +7,7 @@
 #define VCPU_USR_SP		(VCPU_USR_REG(13))
 #define VCPU_USR_LR		(VCPU_USR_REG(14))
 #define CP15_OFFSET(_cp15_reg_idx) (VCPU_CP15 + (_cp15_reg_idx * 4))
+#define CP14_OFFSET(_cp14_reg_idx) (VCPU_CP14 + ((_cp14_reg_idx) * 4))
 
 /*
  * Many of these macros need to access the VCPU structure, which is always
@@ -168,8 +169,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
  * Clobbers *all* registers.
  */
 .macro restore_guest_regs
-	/* reset DBGDSCR to disable debug mode */
-	mov	r2, #0
+	ldr	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
 	mcr	p14, 0, r2, c0, c2, 2
 
 	restore_guest_regs_mode svc, #VCPU_SVC_REGS
@@ -250,6 +250,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	save_guest_regs_mode abt, #VCPU_ABT_REGS
 	save_guest_regs_mode und, #VCPU_UND_REGS
 	save_guest_regs_mode irq, #VCPU_IRQ_REGS
+
+	/* DBGDSCR reg */
+	mrc	p14, 0, r2, c0, c1, 0
+	str	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
 .endm
 
 /* Reads cp15 registers from hardware and stores them in memory
@@ -449,6 +453,231 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
 .endm
 
+/* Assume r11/r12 in use, clobbers r2-r10 */
+.macro cp14_read_and_push Op2 skip_num
+	cmp	\skip_num, #8
+	// if (skip_num >= 8) then skip c8-c15 directly
+	bge	1f
+	adr	r2, 9998f
+	add	r2, r2, \skip_num, lsl #2
+	bx	r2
+1:
+	adr	r2, 9999f
+	sub	r3, \skip_num, #8
+	add	r2, r2, r3, lsl #2
+	bx	r2
+9998:
+	mrc	p14, 0, r10, c0, c15, \Op2
+	mrc	p14, 0, r9, c0, c14, \Op2
+	mrc	p14, 0, r8, c0, c13, \Op2
+	mrc	p14, 0, r7, c0, c12, \Op2
+	mrc	p14, 0, r6, c0, c11, \Op2
+	mrc	p14, 0, r5, c0, c10, \Op2
+	mrc	p14, 0, r4, c0, c9, \Op2
+	mrc	p14, 0, r3, c0, c8, \Op2
+	push	{r3-r10}
+9999:
+	mrc	p14, 0, r10, c0, c7, \Op2
+	mrc	p14, 0, r9, c0, c6, \Op2
+	mrc	p14, 0, r8, c0, c5, \Op2
+	mrc	p14, 0, r7, c0, c4, \Op2
+	mrc	p14, 0, r6, c0, c3, \Op2
+	mrc	p14, 0, r5, c0, c2, \Op2
+	mrc	p14, 0, r4, c0, c1, \Op2
+	mrc	p14, 0, r3, c0, c0, \Op2
+	push	{r3-r10}
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r10 */
+.macro cp14_pop_and_write Op2 skip_num
+	cmp	\skip_num, #8
+	// if (skip_num >= 8) then skip c8-c15 directly
+	bge	1f
+	adr	r2, 9998f
+	add	r2, r2, \skip_num, lsl #2
+	pop	{r3-r10}
+	bx	r2
+1:
+	adr	r2, 9999f
+	sub	r3, \skip_num, #8
+	add	r2, r2, r3, lsl #2
+	pop	{r3-r10}
+	bx	r2
+
+9998:
+	mcr	p14, 0, r10, c0, c15, \Op2
+	mcr	p14, 0, r9, c0, c14, \Op2
+	mcr	p14, 0, r8, c0, c13, \Op2
+	mcr	p14, 0, r7, c0, c12, \Op2
+	mcr	p14, 0, r6, c0, c11, \Op2
+	mcr	p14, 0, r5, c0, c10, \Op2
+	mcr	p14, 0, r4, c0, c9, \Op2
+	mcr	p14, 0, r3, c0, c8, \Op2
+
+	pop	{r3-r10}
+9999:
+	mcr	p14, 0, r10, c0, c7, \Op2
+	mcr	p14, 0, r9, c0, c6, \Op2
+	mcr	p14, 0, r8, c0, c5, \Op2
+	mcr	p14, 0, r7, c0, c4, \Op2
+	mcr	p14, 0, r6, c0, c3, \Op2
+	mcr	p14, 0, r5, c0, c2, \Op2
+	mcr	p14, 0, r4, c0, c1, \Op2
+	mcr	p14, 0, r3, c0, c0, \Op2
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r3 */
+.macro cp14_read_and_str Op2 cp14_reg0 skip_num
+	adr	r3, 1f
+	add	r3, r3, \skip_num, lsl #3
+	bx	r3
+1:
+	mrc	p14, 0, r2, c0, c15, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
+	mrc	p14, 0, r2, c0, c14, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
+	mrc	p14, 0, r2, c0, c13, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
+	mrc	p14, 0, r2, c0, c12, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
+	mrc	p14, 0, r2, c0, c11, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
+	mrc	p14, 0, r2, c0, c10, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
+	mrc	p14, 0, r2, c0, c9, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
+	mrc	p14, 0, r2, c0, c8, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
+	mrc	p14, 0, r2, c0, c7, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
+	mrc	p14, 0, r2, c0, c6, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
+	mrc	p14, 0, r2, c0, c5, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
+	mrc	p14, 0, r2, c0, c4, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
+	mrc	p14, 0, r2, c0, c3, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
+	mrc	p14, 0, r2, c0, c2, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
+	mrc	p14, 0, r2, c0, c1, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
+	mrc	p14, 0, r2, c0, c0, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r3 */
+.macro cp14_ldr_and_write Op2 cp14_reg0 skip_num
+	adr	r3, 1f
+	add	r3, r3, \skip_num, lsl #3
+	bx	r3
+1:
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
+	mcr	p14, 0, r2, c0, c15, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
+	mcr	p14, 0, r2, c0, c14, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
+	mcr	p14, 0, r2, c0, c13, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
+	mcr	p14, 0, r2, c0, c12, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
+	mcr	p14, 0, r2, c0, c11, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
+	mcr	p14, 0, r2, c0, c10, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
+	mcr	p14, 0, r2, c0, c9, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
+	mcr	p14, 0, r2, c0, c8, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
+	mcr	p14, 0, r2, c0, c7, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
+	mcr	p14, 0, r2, c0, c6, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
+	mcr	p14, 0, r2, c0, c5, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
+	mcr	p14, 0, r2, c0, c4, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
+	mcr	p14, 0, r2, c0, c3, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
+	mcr	p14, 0, r2, c0, c2, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
+	mcr	p14, 0, r2, c0, c1, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
+	mcr	p14, 0, r2, c0, c0, \Op2
+.endm
+
+/* Extract the number of BRPs and WRPs; the skip counts are left in r11/r12 */
+.macro read_hw_dbg_num
+	mrc	p14, 0, r2, c0, c0, 0
+	ubfx	r11, r2, #24, #4
+	add	r11, r11, #1		// Extract BRPs
+	ubfx	r12, r2, #28, #4
+	add	r12, r12, #1		// Extract WRPs
+	mov	r2, #16
+	sub	r11, r2, r11		// How many BPs to skip
+	sub	r12, r2, r12		// How many WPs to skip
+.endm
+
+/* Reads cp14 registers from hardware.
+ * Writes cp14 registers in-order to the stack.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro save_host_debug_regs
+	read_hw_dbg_num
+	cp14_read_and_push #4, r11	@ DBGBVR
+	cp14_read_and_push #5, r11	@ DBGBCR
+	cp14_read_and_push #6, r12	@ DBGWVR
+	cp14_read_and_push #7, r12	@ DBGWCR
+.endm
+
+/* Reads cp14 registers from hardware.
+ * Writes cp14 registers in-order to the VCPU struct pointed to by vcpup.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro save_guest_debug_regs
+	read_hw_dbg_num
+	cp14_read_and_str #4, cp14_DBGBVR0, r11
+	cp14_read_and_str #5, cp14_DBGBCR0, r11
+	cp14_read_and_str #6, cp14_DBGWVR0, r12
+	cp14_read_and_str #7, cp14_DBGWCR0, r12
+.endm
+
+/* Reads cp14 registers in-order from the stack.
+ * Writes cp14 registers to hardware.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro restore_host_debug_regs
+	read_hw_dbg_num
+	cp14_pop_and_write #4, r11	@ DBGBVR
+	cp14_pop_and_write #5, r11	@ DBGBCR
+	cp14_pop_and_write #6, r12	@ DBGWVR
+	cp14_pop_and_write #7, r12	@ DBGWCR
+.endm
+
+/* Reads cp14 registers in-order from the VCPU struct pointed to by vcpup
+ * Writes cp14 registers to hardware.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro restore_guest_debug_regs
+	read_hw_dbg_num
+	cp14_ldr_and_write #4, cp14_DBGBVR0, r11
+	cp14_ldr_and_write #5, cp14_DBGBCR0, r11
+	cp14_ldr_and_write #6, cp14_DBGWVR0, r12
+	cp14_ldr_and_write #7, cp14_DBGWCR0, r12
+.endm
+
 /*
  * Save the VGIC CPU state into memory
  *
@@ -684,3 +913,19 @@ ARM_BE8(rev	r6, r6  )
 .macro load_vcpu
 	mrc	p15, 4, vcpu, c13, c0, 2	@ HTPIDR
 .endm
+
+__save_host_debug_regs:
+	save_host_debug_regs
+	bx	lr
+
+__save_guest_debug_regs:
+	save_guest_debug_regs
+	bx	lr
+
+__restore_host_debug_regs:
+	restore_host_debug_regs
+	bx	lr
+
+__restore_guest_debug_regs:
+	restore_guest_debug_regs
+	bx	lr
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 09/11] KVM: arm: implement lazy world switch for debug registers
@ 2015-06-22 10:41   ` Zhichao Huang
  0 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

Implement switching of the debug registers. While the number
of registers is massive, CPUs usually don't implement them all
(A15 has 6 breakpoints and 4 watchpoints, which gives us a total
of 22 registers "only").

Notice that, for ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in
the guest, debug is always actively in use (ARM_DSCR_MDBGEN set).

We have to do the save/restore dance in this case, because the host
and the guest might use their respective debug registers at any moment.

If the CONFIG_HAVE_HW_BREAKPOINT is not set, and if no one flagged
the debug registers as dirty, we only save/restore DBGDSCR.

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/interrupts.S      |  16 +++
 arch/arm/kvm/interrupts_head.S | 249 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 263 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 79caf79..d626275 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -116,6 +116,12 @@ ENTRY(__kvm_vcpu_run)
 	read_cp15_state store_to_vcpu = 0
 	write_cp15_state read_from_vcpu = 1
 
+	@ Store hardware CP14 state and load guest state
+	compute_debug_state 1f
+	bl __save_host_debug_regs
+	bl __restore_guest_debug_regs
+
+1:
 	@ If the host kernel has not been configured with VFPv3 support,
 	@ then it is safer if we deny guests from using it as well.
 #ifdef CONFIG_VFPv3
@@ -201,6 +207,16 @@ after_vfp_restore:
 	mrc	p15, 0, r2, c0, c0, 5
 	mcr	p15, 4, r2, c0, c0, 5
 
+	@ Store guest CP14 state and restore host state
+	skip_debug_state 1f
+	bl __save_guest_debug_regs
+	bl __restore_host_debug_regs
+	/* Clear the dirty flag for the next run, as all the state has
+	 * already been saved. Note that we nuke the whole 32bit word.
+	 * If we ever add more flags, we'll have to be more careful...
+	 */
+	clear_debug_dirty_bit
+1:
 	@ Store guest CP15 state and restore host state
 	read_cp15_state store_to_vcpu = 1
 	write_cp15_state read_from_vcpu = 0
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 5662c39..ed406be 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -7,6 +7,7 @@
 #define VCPU_USR_SP		(VCPU_USR_REG(13))
 #define VCPU_USR_LR		(VCPU_USR_REG(14))
 #define CP15_OFFSET(_cp15_reg_idx) (VCPU_CP15 + (_cp15_reg_idx * 4))
+#define CP14_OFFSET(_cp14_reg_idx) (VCPU_CP14 + ((_cp14_reg_idx) * 4))
 
 /*
  * Many of these macros need to access the VCPU structure, which is always
@@ -168,8 +169,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
  * Clobbers *all* registers.
  */
 .macro restore_guest_regs
-	/* reset DBGDSCR to disable debug mode */
-	mov	r2, #0
+	ldr	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
 	mcr	p14, 0, r2, c0, c2, 2
 
 	restore_guest_regs_mode svc, #VCPU_SVC_REGS
@@ -250,6 +250,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	save_guest_regs_mode abt, #VCPU_ABT_REGS
 	save_guest_regs_mode und, #VCPU_UND_REGS
 	save_guest_regs_mode irq, #VCPU_IRQ_REGS
+
+	/* DBGDSCR reg */
+	mrc	p14, 0, r2, c0, c1, 0
+	str	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
 .endm
 
 /* Reads cp15 registers from hardware and stores them in memory
@@ -449,6 +453,231 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
 .endm
 
+/* Assume r11/r12 in use, clobbers r2-r10 */
+.macro cp14_read_and_push Op2 skip_num
+	cmp	\skip_num, #8
+	// if (skip_num >= 8) then skip c8-c15 directly
+	bge	1f
+	adr	r2, 9998f
+	add	r2, r2, \skip_num, lsl #2
+	bx	r2
+1:
+	adr	r2, 9999f
+	sub	r3, \skip_num, #8
+	add	r2, r2, r3, lsl #2
+	bx	r2
+9998:
+	mrc	p14, 0, r10, c0, c15, \Op2
+	mrc	p14, 0, r9, c0, c14, \Op2
+	mrc	p14, 0, r8, c0, c13, \Op2
+	mrc	p14, 0, r7, c0, c12, \Op2
+	mrc	p14, 0, r6, c0, c11, \Op2
+	mrc	p14, 0, r5, c0, c10, \Op2
+	mrc	p14, 0, r4, c0, c9, \Op2
+	mrc	p14, 0, r3, c0, c8, \Op2
+	push	{r3-r10}
+9999:
+	mrc	p14, 0, r10, c0, c7, \Op2
+	mrc	p14, 0, r9, c0, c6, \Op2
+	mrc	p14, 0, r8, c0, c5, \Op2
+	mrc	p14, 0, r7, c0, c4, \Op2
+	mrc	p14, 0, r6, c0, c3, \Op2
+	mrc	p14, 0, r5, c0, c2, \Op2
+	mrc	p14, 0, r4, c0, c1, \Op2
+	mrc	p14, 0, r3, c0, c0, \Op2
+	push	{r3-r10}
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r10 */
+.macro cp14_pop_and_write Op2 skip_num
+	cmp	\skip_num, #8
+	// if (skip_num >= 8) then skip c8-c15 directly
+	bge	1f
+	adr	r2, 9998f
+	add	r2, r2, \skip_num, lsl #2
+	pop	{r3-r10}
+	bx	r2
+1:
+	adr	r2, 9999f
+	sub	r3, \skip_num, #8
+	add	r2, r2, r3, lsl #2
+	pop	{r3-r10}
+	bx	r2
+
+9998:
+	mcr	p14, 0, r10, c0, c15, \Op2
+	mcr	p14, 0, r9, c0, c14, \Op2
+	mcr	p14, 0, r8, c0, c13, \Op2
+	mcr	p14, 0, r7, c0, c12, \Op2
+	mcr	p14, 0, r6, c0, c11, \Op2
+	mcr	p14, 0, r5, c0, c10, \Op2
+	mcr	p14, 0, r4, c0, c9, \Op2
+	mcr	p14, 0, r3, c0, c8, \Op2
+
+	pop	{r3-r10}
+9999:
+	mcr	p14, 0, r10, c0, c7, \Op2
+	mcr	p14, 0, r9, c0, c6, \Op2
+	mcr	p14, 0, r8, c0, c5, \Op2
+	mcr	p14, 0, r7, c0, c4, \Op2
+	mcr	p14, 0, r6, c0, c3, \Op2
+	mcr	p14, 0, r5, c0, c2, \Op2
+	mcr	p14, 0, r4, c0, c1, \Op2
+	mcr	p14, 0, r3, c0, c0, \Op2
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r3 */
+.macro cp14_read_and_str Op2 cp14_reg0 skip_num
+	adr	r3, 1f
+	add	r3, r3, \skip_num, lsl #3
+	bx	r3
+1:
+	mrc	p14, 0, r2, c0, c15, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
+	mrc	p14, 0, r2, c0, c14, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
+	mrc	p14, 0, r2, c0, c13, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
+	mrc	p14, 0, r2, c0, c12, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
+	mrc	p14, 0, r2, c0, c11, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
+	mrc	p14, 0, r2, c0, c10, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
+	mrc	p14, 0, r2, c0, c9, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
+	mrc	p14, 0, r2, c0, c8, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
+	mrc	p14, 0, r2, c0, c7, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
+	mrc	p14, 0, r2, c0, c6, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
+	mrc	p14, 0, r2, c0, c5, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
+	mrc	p14, 0, r2, c0, c4, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
+	mrc	p14, 0, r2, c0, c3, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
+	mrc	p14, 0, r2, c0, c2, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
+	mrc	p14, 0, r2, c0, c1, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
+	mrc	p14, 0, r2, c0, c0, \Op2
+	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
+.endm
+
+/* Assume r11/r12 in use, clobbers r2-r3 */
+.macro cp14_ldr_and_write Op2 cp14_reg0 skip_num
+	adr	r3, 1f
+	add	r3, r3, \skip_num, lsl #3
+	bx	r3
+1:
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
+	mcr	p14, 0, r2, c0, c15, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
+	mcr	p14, 0, r2, c0, c14, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
+	mcr	p14, 0, r2, c0, c13, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
+	mcr	p14, 0, r2, c0, c12, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
+	mcr	p14, 0, r2, c0, c11, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
+	mcr	p14, 0, r2, c0, c10, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
+	mcr	p14, 0, r2, c0, c9, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
+	mcr	p14, 0, r2, c0, c8, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
+	mcr	p14, 0, r2, c0, c7, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
+	mcr	p14, 0, r2, c0, c6, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
+	mcr	p14, 0, r2, c0, c5, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
+	mcr	p14, 0, r2, c0, c4, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
+	mcr	p14, 0, r2, c0, c3, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
+	mcr	p14, 0, r2, c0, c2, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
+	mcr	p14, 0, r2, c0, c1, \Op2
+	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
+	mcr	p14, 0, r2, c0, c0, \Op2
+.endm
+
+/* Extract the number of BRPs and WRPs; the skip counts are left in r11/r12 */
+.macro read_hw_dbg_num
+	mrc	p14, 0, r2, c0, c0, 0
+	ubfx	r11, r2, #24, #4
+	add	r11, r11, #1		// Extract BRPs
+	ubfx	r12, r2, #28, #4
+	add	r12, r12, #1		// Extract WRPs
+	mov	r2, #16
+	sub	r11, r2, r11		// How many BPs to skip
+	sub	r12, r2, r12		// How many WPs to skip
+.endm
+
+/* Reads cp14 registers from hardware.
+ * Writes cp14 registers in-order to the stack.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro save_host_debug_regs
+	read_hw_dbg_num
+	cp14_read_and_push #4, r11	@ DBGBVR
+	cp14_read_and_push #5, r11	@ DBGBCR
+	cp14_read_and_push #6, r12	@ DBGWVR
+	cp14_read_and_push #7, r12	@ DBGWCR
+.endm
+
+/* Reads cp14 registers from hardware.
+ * Writes cp14 registers in-order to the VCPU struct pointed to by vcpup.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro save_guest_debug_regs
+	read_hw_dbg_num
+	cp14_read_and_str #4, cp14_DBGBVR0, r11
+	cp14_read_and_str #5, cp14_DBGBCR0, r11
+	cp14_read_and_str #6, cp14_DBGWVR0, r12
+	cp14_read_and_str #7, cp14_DBGWCR0, r12
+.endm
+
+/* Reads cp14 registers in-order from the stack.
+ * Writes cp14 registers to hardware.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro restore_host_debug_regs
+	read_hw_dbg_num
+	cp14_pop_and_write #4, r11	@ DBGBVR
+	cp14_pop_and_write #5, r11	@ DBGBCR
+	cp14_pop_and_write #6, r12	@ DBGWVR
+	cp14_pop_and_write #7, r12	@ DBGWCR
+.endm
+
+/* Reads cp14 registers in-order from the VCPU struct pointed to by vcpup
+ * Writes cp14 registers to hardware.
+ *
+ * Assumes vcpu pointer in vcpu reg
+ *
+ * Clobbers r2-r12
+ */
+.macro restore_guest_debug_regs
+	read_hw_dbg_num
+	cp14_ldr_and_write #4, cp14_DBGBVR0, r11
+	cp14_ldr_and_write #5, cp14_DBGBCR0, r11
+	cp14_ldr_and_write #6, cp14_DBGWVR0, r12
+	cp14_ldr_and_write #7, cp14_DBGWCR0, r12
+.endm
+
 /*
  * Save the VGIC CPU state into memory
  *
@@ -684,3 +913,19 @@ ARM_BE8(rev	r6, r6  )
 .macro load_vcpu
 	mrc	p15, 4, vcpu, c13, c0, 2	@ HTPIDR
 .endm
+
+__save_host_debug_regs:
+	save_host_debug_regs
+	bx	lr
+
+__save_guest_debug_regs:
+	save_guest_debug_regs
+	bx	lr
+
+__restore_host_debug_regs:
+	restore_host_debug_regs
+	bx	lr
+
+__restore_guest_debug_regs:
+	restore_guest_debug_regs
+	bx	lr
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 10/11] KVM: arm: add a trace event for cp14 traps
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

There are too many cp15 traps, so we don't reuse the cp15 trace event
but add a new trace event to trace accesses to the debug registers.
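
For reference, a guest write of r3 to DBGBCR0 (mcr p14, 0, r3, c0, c0, 5)
would show up in the trace buffer along the lines of:

	kvm_emulate_cp14_imp: Implementation defined CP14: mcr	p14, 0, r3, c0, c0, 5

(the exact prefix depends on the usual ftrace formatting; the field order,
Op1, Rt, CRn, CRm, Op2, simply follows the TP_printk string below).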

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/coproc.c | 14 ++++++++++++++
 arch/arm/kvm/trace.h  | 30 ++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index fc0c2ef..42b720a 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -678,6 +678,13 @@ int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
 	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
 	params.CRm = 0;
 
+	if (global == cp14_regs)
+		trace_kvm_emulate_cp14_imp(params.Op1, params.Rt1, params.CRn,
+			params.CRm, params.Op2, params.is_write);
+	else
+		trace_kvm_emulate_cp15_imp(params.Op1, params.Rt1, params.CRn,
+			params.CRm, params.Op2, params.is_write);
+
 	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
 		return 1;
 	if (!emulate_cp(vcpu, &params, global, nr_global))
@@ -715,6 +722,13 @@ int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
 	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
 	params.Rt2 = 0;
 
+	if (global == cp14_regs)
+		trace_kvm_emulate_cp14_imp(params.Op1, params.Rt1, params.CRn,
+			params.CRm, params.Op2, params.is_write);
+	else
+		trace_kvm_emulate_cp15_imp(params.Op1, params.Rt1, params.CRn,
+			params.CRm, params.Op2, params.is_write);
+
 	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
 		return 1;
 	if (!emulate_cp(vcpu, &params, global, nr_global))
diff --git a/arch/arm/kvm/trace.h b/arch/arm/kvm/trace.h
index 0ec3539..988da03 100644
--- a/arch/arm/kvm/trace.h
+++ b/arch/arm/kvm/trace.h
@@ -159,6 +159,36 @@ TRACE_EVENT(kvm_emulate_cp15_imp,
 			__entry->CRm, __entry->Op2)
 );
 
+/* Architecturally implementation defined CP14 register access */
+TRACE_EVENT(kvm_emulate_cp14_imp,
+	TP_PROTO(unsigned long Op1, unsigned long Rt1, unsigned long CRn,
+		 unsigned long CRm, unsigned long Op2, bool is_write),
+	TP_ARGS(Op1, Rt1, CRn, CRm, Op2, is_write),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	Op1		)
+		__field(	unsigned int,	Rt1		)
+		__field(	unsigned int,	CRn		)
+		__field(	unsigned int,	CRm		)
+		__field(	unsigned int,	Op2		)
+		__field(	bool,		is_write	)
+	),
+
+	TP_fast_assign(
+		__entry->is_write		= is_write;
+		__entry->Op1			= Op1;
+		__entry->Rt1			= Rt1;
+		__entry->CRn			= CRn;
+		__entry->CRm			= CRm;
+		__entry->Op2			= Op2;
+	),
+
+	TP_printk("Implementation defined CP14: %s\tp14, %u, r%u, c%u, c%u, %u",
+			(__entry->is_write) ? "mcr" : "mrc",
+			__entry->Op1, __entry->Rt1, __entry->CRn,
+			__entry->CRm, __entry->Op2)
+);
+
 TRACE_EVENT(kvm_wfx,
 	TP_PROTO(unsigned long vcpu_pc, bool is_wfe),
 	TP_ARGS(vcpu_pc, is_wfe),
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 11/11] KVM: arm: enable trapping of all debug registers
  2015-06-22 10:41 ` Zhichao Huang
@ 2015-06-22 10:41   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-06-22 10:41 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier,
	alex.bennee, will.deacon
  Cc: huangzhichao, Zhichao Huang

Enable trapping of the debug registers, allowing guests to use
the debug infrastructure.
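
A rough C rendering of the vmentry side of the HDCR setup below (sketch
only; get_hdcr()/set_hdcr() are hypothetical stand-ins for the mrc/mcr
on p15, 4, c1, c1, 1):

	u32 hdcr = get_hdcr();
	u32 trap = HDCR_TPM | HDCR_TPMCR | HDCR_TDRA | HDCR_TDOSA;

	/*
	 * Keep trapping debug register accesses (HDCR_TDA) unless the
	 * guest has already dirtied them; in that case the lazy world
	 * switch takes over the save/restore.
	 */
	if (vcpu->arch.debug_flags != KVM_ARM_DEBUG_DIRTY)
		trap |= HDCR_TDA;

	set_hdcr(hdcr | trap);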

Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
---
 arch/arm/kvm/interrupts_head.S | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index ed406be..107bda4 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -886,10 +886,21 @@ ARM_BE8(rev	r6, r6  )
 .endm
 
 /* Configures the HDCR (Hyp Debug Configuration Register) on entry/return
- * (hardware reset value is 0) */
+ * (hardware reset value is 0)
+ *
+ * Clobbers r2-r4
+ */
 .macro set_hdcr operation
 	mrc	p15, 4, r2, c1, c1, 1
-	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
+	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA)
+
+	// Check for KVM_ARM_DEBUG_DIRTY, and set debug to trap
+	// if not dirty.
+	ldr	r4, [vcpu, #VCPU_DEBUG_FLAGS]
+	cmp	r4, #KVM_ARM_DEBUG_DIRTY
+	beq	1f
+	orr	r3, r3,  #HDCR_TDA
+1:
 	.if \operation == vmentry
 	orr	r2, r2, r3		@ Trap some perfmon accesses
 	.else
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 01/11] KVM: arm: plug guest debug exploit
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-29 15:49     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-29 15:49 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao, stable

On Mon, Jun 22, 2015 at 06:41:24PM +0800, Zhichao Huang wrote:
> Hardware debugging in guests is not currently intercepted, which means
> that a malicious guest can bring down the entire machine by writing
> to the debug registers.
> 
> This patch enables trapping of all debug registers, preventing the
> guests from accessing them.
> 
> This patch also disables the debug mode (DBGDSCR) in the guest world at
> all times, preventing the guests from messing with the host state.
> 
> However, it is a precursor for later patches which will need to do more
> to world-switch debug state when necessary.
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/include/asm/kvm_coproc.h |  3 +-
>  arch/arm/kvm/coproc.c             | 60 +++++++++++++++++++++++++++++++++++----
>  arch/arm/kvm/handle_exit.c        |  4 +--
>  arch/arm/kvm/interrupts_head.S    | 13 ++++++++-
>  4 files changed, 70 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_coproc.h b/arch/arm/include/asm/kvm_coproc.h
> index 4917c2f..e74ab0f 100644
> --- a/arch/arm/include/asm/kvm_coproc.h
> +++ b/arch/arm/include/asm/kvm_coproc.h
> @@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table);
>  int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);
> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  
> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> index f3d88dc..2e12760 100644
> --- a/arch/arm/kvm/coproc.c
> +++ b/arch/arm/kvm/coproc.c
> @@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	return 1;
>  }
>  
> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run)
> -{
> -	kvm_inject_undefined(vcpu);
> -	return 1;
> -}
> -
>  static void reset_mpidr(struct kvm_vcpu *vcpu, const struct coproc_reg *r)
>  {
>  	/*
> @@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	return emulate_cp15(vcpu, &params);
>  }
>  
> +/**
> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
> + * @vcpu: The VCPU pointer
> + * @run:  The kvm_run struct
> + */
> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> +	struct coproc_params params;
> +
> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> +	params.is_64bit = true;
> +
> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
> +	params.Op2 = 0;
> +	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> +	params.CRm = 0;

this is a complete duplicate of kvm_handle_cp15_64, can you share this
code somehow?

> +
> +	/* raz_wi */
> +	(void)pm_fake(vcpu, &params, NULL);
> +
> +	/* handled */
> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +	return 1;
> +}
> +
> +/**
> + * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14 access
> + * @vcpu: The VCPU pointer
> + * @run:  The kvm_run struct
> + */
> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> +	struct coproc_params params;
> +
> +	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> +	params.is_64bit = false;
> +
> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
> +	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
> +	params.Rt2 = 0;

this is a complete duplicate of kvm_handle_cp15_32, can you share this
code somehow?

> +
> +	/* raz_wi */
> +	(void)pm_fake(vcpu, &params, NULL);
> +
> +	/* handled */
> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +	return 1;
> +}
> +
>  /******************************************************************************
>   * Userspace API
>   *****************************************************************************/
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index 95f12b2..357ad1b 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -104,9 +104,9 @@ static exit_handle_fn arm_exit_handlers[] = {
>  	[HSR_EC_WFI]		= kvm_handle_wfx,
>  	[HSR_EC_CP15_32]	= kvm_handle_cp15_32,
>  	[HSR_EC_CP15_64]	= kvm_handle_cp15_64,
> -	[HSR_EC_CP14_MR]	= kvm_handle_cp14_access,
> +	[HSR_EC_CP14_MR]	= kvm_handle_cp14_32,
>  	[HSR_EC_CP14_LS]	= kvm_handle_cp14_load_store,
> -	[HSR_EC_CP14_64]	= kvm_handle_cp14_access,
> +	[HSR_EC_CP14_64]	= kvm_handle_cp14_64,
>  	[HSR_EC_CP_0_13]	= kvm_handle_cp_0_13_access,
>  	[HSR_EC_CP10_ID]	= kvm_handle_cp10_id,
>  	[HSR_EC_SVC_HYP]	= handle_svc_hyp,
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 35e4a3a..f85c447 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -97,6 +97,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	mrs	r8, LR_fiq
>  	mrs	r9, SPSR_fiq
>  	push	{r2-r9}
> +
> +	/* DBGDSCR reg */
> +	mrc	p14, 0, r2, c0, c1, 0
> +	push	{r2}

this feels like it should belong in read_cp15_state and not the gp regs
portion ?


>  .endm
>  
>  .macro pop_host_regs_mode mode
> @@ -111,6 +115,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>   * Clobbers all registers, in all modes, except r0 and r1.
>   */
>  .macro restore_host_regs
> +	pop	{r2}
> +	mcr	p14, 0, r2, c0, c2, 2
> +

Why are we reading the DBGDSCRint and writing the DBGDSCRext ?

>  	pop	{r2-r9}
>  	msr	r8_fiq, r2
>  	msr	r9_fiq, r3
> @@ -159,6 +166,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>   * Clobbers *all* registers.
>   */
>  .macro restore_guest_regs
> +	/* reset DBGDSCR to disable debug mode */
> +	mov	r2, #0
> +	mcr	p14, 0, r2, c0, c2, 2

Is it valid to write 0 in all fields of this register?

I thought Will expressed concern about accessing this register?  Why is
it safe in this context and not before?  It seems from the spec that
this can still raise an undefined exception if an external debugger
lowers the software debug enable signal.

> +
>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
>  	restore_guest_regs_mode abt, #VCPU_ABT_REGS
>  	restore_guest_regs_mode und, #VCPU_UND_REGS
> @@ -607,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
>   * (hardware reset value is 0) */
>  .macro set_hdcr operation
>  	mrc	p15, 4, r2, c1, c1, 1
> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
>  	.if \operation == vmentry
>  	orr	r2, r2, r3		@ Trap some perfmon accesses
>  	.else
> -- 
> 1.7.12.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-29 19:43     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-29 19:43 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:27PM +0800, Zhichao Huang wrote:
> As we're about to trap a bunch of CP14 registers, let's rework
> the CP15 handling so it can be generalized and work with multiple
> tables.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/kvm/coproc.c          | 176 ++++++++++++++++++++++++++---------------
>  arch/arm/kvm/interrupts_head.S |   2 +-
>  2 files changed, 112 insertions(+), 66 deletions(-)
> 
> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> index 9d283d9..d23395b 100644
> --- a/arch/arm/kvm/coproc.c
> +++ b/arch/arm/kvm/coproc.c
> @@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = {
>  	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
>  };
>  
> +static const struct coproc_reg cp14_regs[] = {
> +};
> +
>  /* Target specific emulation tables */
>  static struct kvm_coproc_target_table *target_tables[KVM_ARM_NUM_TARGETS];
>  
> @@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const struct coproc_params *params,
>  	return NULL;
>  }
>  
> -static int emulate_cp15(struct kvm_vcpu *vcpu,
> -			const struct coproc_params *params)
> +/*
> + * emulate_cp --  tries to match a cp14/cp15 access in a handling table,
> + *                and call the corresponding trap handler.
> + *
> + * @params: pointer to the descriptor of the access
> + * @table: array of trap descriptors
> + * @num: size of the trap descriptor array
> + *
> + * Return 0 if the access has been handled, and -1 if not.
> + */
> +static int emulate_cp(struct kvm_vcpu *vcpu,
> +			const struct coproc_params *params,
> +			const struct coproc_reg *table,
> +			size_t num)
>  {
> -	size_t num;
> -	const struct coproc_reg *table, *r;
> -
> -	trace_kvm_emulate_cp15_imp(params->Op1, params->Rt1, params->CRn,
> -				   params->CRm, params->Op2, params->is_write);
> +	const struct coproc_reg *r;
>  
> -	table = get_target_table(vcpu->arch.target, &num);
> +	if (!table)
> +		return -1;	/* Not handled */
>  
> -	/* Search target-specific then generic table. */
>  	r = find_reg(params, table, num);
> -	if (!r)
> -		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
>  
> -	if (likely(r)) {
> +	if (r) {
>  		/* If we don't have an accessor, we should never get here! */
>  		BUG_ON(!r->access);
>  
>  		if (likely(r->access(vcpu, params, r))) {
>  			/* Skip instruction, since it was emulated */
>  			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> -			return 1;
>  		}
> -		/* If access function fails, it should complain. */
> -	} else {
> -		kvm_err("Unsupported guest CP15 access at: %08lx\n",
> -			*vcpu_pc(vcpu));
> -		print_cp_instr(params);
> +
> +		/* Handled */
> +		return 0;
>  	}
> +
> +	/* Not handled */
> +	return -1;
> +}
> +
> +static void unhandled_cp_access(struct kvm_vcpu *vcpu,
> +				const struct coproc_params *params)
> +{
> +	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
> +	int cp;
> +
> +	switch (hsr_ec) {
> +	case HSR_EC_CP15_32:
> +	case HSR_EC_CP15_64:
> +		cp = 15;
> +		break;
> +	case HSR_EC_CP14_MR:
> +	case HSR_EC_CP14_64:
> +		cp = 14;
> +		break;
> +	default:
> +		WARN_ON((cp = -1));
> +	}
> +
> +	kvm_err("Unsupported guest CP%d access at: %08lx\n",
> +		cp, *vcpu_pc(vcpu));
> +	print_cp_instr(params);
>  	kvm_inject_undefined(vcpu);
> -	return 1;
>  }
>  
> -/**
> - * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access
> - * @vcpu: The VCPU pointer
> - * @run:  The kvm_run struct
> - */
> -int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
> +			const struct coproc_reg *global,
> +			size_t nr_global,
> +			const struct coproc_reg *target_specific,
> +			size_t nr_specific)
>  {
>  	struct coproc_params params;
>  
> @@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>  	params.CRm = 0;
>  
> -	return emulate_cp15(vcpu, &params);
> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
> +		return 1;
> +	if (!emulate_cp(vcpu, &params, global, nr_global))
> +		return 1;
> +
> +	unhandled_cp_access(vcpu, &params);
> +	return 1;
>  }
>  
>  static void reset_coproc_regs(struct kvm_vcpu *vcpu,
> @@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu *vcpu,
>  			table[i].reset(vcpu, &table[i]);
>  }
>  
> -/**
> - * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access
> - * @vcpu: The VCPU pointer
> - * @run:  The kvm_run struct
> - */
> -int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
> +			const struct coproc_reg *global,
> +			size_t nr_global,
> +			const struct coproc_reg *target_specific,
> +			size_t nr_specific)
>  {
>  	struct coproc_params params;
>  
> @@ -510,33 +546,57 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
>  	params.Rt2 = 0;
>  
> -	return emulate_cp15(vcpu, &params);
> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
> +		return 1;
> +	if (!emulate_cp(vcpu, &params, global, nr_global))
> +		return 1;
> +
> +	unhandled_cp_access(vcpu, &params);
> +	return 1;
>  }
>  
>  /**
> - * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
> + * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access
>   * @vcpu: The VCPU pointer
>   * @run:  The kvm_run struct
>   */
> -int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	struct coproc_params params;
> +	const struct coproc_reg *target_specific;
> +	size_t num;
>  
> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> -	params.is_64bit = true;
> +	target_specific = get_target_table(vcpu->arch.target, &num);
> +	return kvm_handle_cp_64(vcpu,
> +				cp15_regs, ARRAY_SIZE(cp15_regs),
> +				target_specific, num);
> +}
>  
> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
> -	params.Op2 = 0;
> -	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> -	params.CRm = 0;
> +/**
> + * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access
> + * @vcpu: The VCPU pointer
> + * @run:  The kvm_run struct
> + */
> +int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> +	const struct coproc_reg *target_specific;
> +	size_t num;
>  
> -	(void)trap_raz_wi(vcpu, &params, NULL);
> +	target_specific = get_target_table(vcpu->arch.target, &num);
> +	return kvm_handle_cp_32(vcpu,
> +				cp15_regs, ARRAY_SIZE(cp15_regs),
> +				target_specific, num);
> +}
>  
> -	/* handled */
> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> -	return 1;
> +/**
> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access
> + * @vcpu: The VCPU pointer
> + * @run:  The kvm_run struct
> + */
> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> +	return kvm_handle_cp_64(vcpu,
> +				cp14_regs, ARRAY_SIZE(cp14_regs),
> +				NULL, 0);
>  }
>  
>  /**
> @@ -546,23 +606,9 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   */
>  int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	struct coproc_params params;
> -
> -	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> -	params.is_64bit = false;
> -
> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
> -	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
> -	params.Rt2 = 0;
> -
> -	(void)trap_raz_wi(vcpu, &params, NULL);
> -
> -	/* handled */
> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> -	return 1;
> +	return kvm_handle_cp_32(vcpu,
> +				cp14_regs, ARRAY_SIZE(cp14_regs),
> +				NULL, 0);
>  }
>  
>  /******************************************************************************
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index f85c447..a20b9ad 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -618,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
>   * (hardware reset value is 0) */
>  .macro set_hdcr operation
>  	mrc	p15, 4, r2, c1, c1, 1
> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)

why do we stop trapping accesses here?

-Christoffer

>  	.if \operation == vmentry
>  	orr	r2, r2, r3		@ Trap some perfmon accesses
>  	.else
> -- 
> 1.7.12.4
> 

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-29 21:16     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-29 21:16 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:29PM +0800, Zhichao Huang wrote:
> Add handlers for all the 32-bit debug registers.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/include/asm/kvm_asm.h  |  12 ++++
>  arch/arm/include/asm/kvm_host.h |   3 +
>  arch/arm/kernel/asm-offsets.c   |   1 +
>  arch/arm/kvm/coproc.c           | 122 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 138 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index 25410b2..ba65e05 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -52,6 +52,18 @@
>  #define c10_AMAIR1	30	/* Auxilary Memory Attribute Indirection Reg1 */
>  #define NR_CP15_REGS	31	/* Number of regs (incl. invalid) */
>  
> +/* 0 is reserved as an invalid value. */
> +#define cp14_DBGBVR0	1	/* Debug Breakpoint Value Registers (0-15) */
> +#define cp14_DBGBVR15	16
> +#define cp14_DBGBCR0	17	/* Debug Breakpoint Control Registers (0-15) */
> +#define cp14_DBGBCR15	32
> +#define cp14_DBGWVR0	33	/* Debug Watchpoint Value Registers (0-15) */
> +#define cp14_DBGWVR15	48
> +#define cp14_DBGWCR0	49	/* Debug Watchpoint Control Registers (0-15) */
> +#define cp14_DBGWCR15	64
> +#define cp14_DBGDSCRext	65	/* Debug Status and Control external */
> +#define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
> +
>  #define ARM_EXCEPTION_RESET	  0
>  #define ARM_EXCEPTION_UNDEFINED   1
>  #define ARM_EXCEPTION_SOFTWARE    2
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index d71607c..3d16820 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -124,6 +124,9 @@ struct kvm_vcpu_arch {
>  	struct vgic_cpu vgic_cpu;
>  	struct arch_timer_cpu timer_cpu;
>  
> +	/* System control coprocessor (cp14) */
> +	u32 cp14[NR_CP14_REGS];
> +
>  	/*
>  	 * Anything that is not used directly from assembly code goes
>  	 * here.
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 871b826..9158de0 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -172,6 +172,7 @@ int main(void)
>  #ifdef CONFIG_KVM_ARM_HOST
>    DEFINE(VCPU_KVM,		offsetof(struct kvm_vcpu, kvm));
>    DEFINE(VCPU_MIDR,		offsetof(struct kvm_vcpu, arch.midr));
> +  DEFINE(VCPU_CP14,		offsetof(struct kvm_vcpu, arch.cp14));
>    DEFINE(VCPU_CP15,		offsetof(struct kvm_vcpu, arch.cp15));
>    DEFINE(VCPU_VFP_GUEST,	offsetof(struct kvm_vcpu, arch.vfp_guest));
>    DEFINE(VCPU_VFP_HOST,		offsetof(struct kvm_vcpu, arch.host_cpu_context));
> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> index 16d5f69..59b65b7 100644
> --- a/arch/arm/kvm/coproc.c
> +++ b/arch/arm/kvm/coproc.c
> @@ -220,6 +220,47 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
>  	return true;
>  }
>  
> +static bool trap_debug32(struct kvm_vcpu *vcpu,
> +			const struct coproc_params *p,
> +			const struct coproc_reg *r)
> +{
> +	if (p->is_write)
> +		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
> +	else
> +		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
> +
> +	return true;
> +}
> +
> +/* DBGIDR (RO) Debug ID */
> +static bool trap_dbgidr(struct kvm_vcpu *vcpu,
> +			const struct coproc_params *p,
> +			const struct coproc_reg *r)
> +{
> +	u32 val;
> +
> +	if (p->is_write)
> +		return ignore_write(vcpu, p);
> +
> +	ARM_DBG_READ(c0, c0, 0, val);
> +	*vcpu_reg(vcpu, p->Rt1) = val;
> +
> +	return true;
> +}
> +
> +/* DBGDSCRint (RO) Debug Status and Control Register */
> +static bool trap_dbgdscr(struct kvm_vcpu *vcpu,
> +			const struct coproc_params *p,
> +			const struct coproc_reg *r)
> +{
> +	if (p->is_write)
> +		return ignore_write(vcpu, p);
> +
> +	*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
> +
> +	return true;
> +}
> +
>  /*
>   * We could trap ID_DFR0 and tell the guest we don't support performance
>   * monitoring.  Unfortunately the patch to make the kernel check ID_DFR0 was
> @@ -375,7 +416,88 @@ static const struct coproc_reg cp15_regs[] = {
>  	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
>  };
>  
> +#define DBG_BCR_BVR_WCR_WVR(n)					\
> +	/* DBGBVRn */						\
> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 4), is32,		\
> +	  trap_debug32,	reset_val, (cp14_DBGBVR0 + (n)), 0 },	\
> +	/* DBGBCRn */						\
> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 5), is32,		\
> +	  trap_debug32,	reset_val, (cp14_DBGBCR0 + (n)), 0 },	\
> +	/* DBGWVRn */						\
> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 6), is32,		\
> +	  trap_debug32,	reset_val, (cp14_DBGWVR0 + (n)), 0 },	\
> +	/* DBGWCRn */						\
> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 7), is32,		\
> +	  trap_debug32,	reset_val, (cp14_DBGWCR0 + (n)), 0 }
> +
> +/* No OS DBGBXVR mechanism implemented. */
> +#define DBGBXVR(n)						\
> +	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
> +
> +/*
> + * Trapped cp14 registers. We generally ignore most of the external
> + * debug, on the principle that they don't really make sense to a
> + * guest. Revisit this one day, whould this principle change.

s/whould/should/

> + */
>  static const struct coproc_reg cp14_regs[] = {
> +	/* DBGIDR */
> +	{ CRn( 0), CRm( 0), Op1( 0), Op2( 0), is32, trap_dbgidr},
> +	/* DBGDTRRXext */
> +	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
> +	DBG_BCR_BVR_WCR_WVR(0),
> +	/* DBGDSCRint */
> +	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
> +				NULL, cp14_DBGDSCRext },
> +	DBG_BCR_BVR_WCR_WVR(1),
> +	/* DBGDSCRext */
> +	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
> +				reset_val, cp14_DBGDSCRext, 0 },
> +	DBG_BCR_BVR_WCR_WVR(2),
> +	/* DBGDTRRXext */
> +	{ CRn( 0), CRm( 3), Op1( 0), Op2( 2), is32, trap_raz_wi },

isn't this the DBGDTRTXext register?

> +	DBG_BCR_BVR_WCR_WVR(3),
> +	DBG_BCR_BVR_WCR_WVR(4),
> +	/* DBGDTR[RT]Xint */
> +	{ CRn( 0), CRm( 5), Op1( 0), Op2( 0), is32, trap_raz_wi },
> +	DBG_BCR_BVR_WCR_WVR(5),
> +	DBG_BCR_BVR_WCR_WVR(6),
> +	/* DBGVCR */
> +	{ CRn( 0), CRm( 7), Op1( 0), Op2( 0), is32, trap_debug32 },
> +	DBG_BCR_BVR_WCR_WVR(7),
> +	DBG_BCR_BVR_WCR_WVR(8),
> +	DBG_BCR_BVR_WCR_WVR(9),
> +	DBG_BCR_BVR_WCR_WVR(10),
> +	DBG_BCR_BVR_WCR_WVR(11),
> +	DBG_BCR_BVR_WCR_WVR(12),
> +	DBG_BCR_BVR_WCR_WVR(13),
> +	DBG_BCR_BVR_WCR_WVR(14),
> +	DBG_BCR_BVR_WCR_WVR(15),
> +
> +	DBGBXVR(0),
> +	/* DBGOSLAR */
> +	{ CRn( 1), CRm( 0), Op1( 0), Op2( 4), is32, trap_raz_wi },
> +	DBGBXVR(1),
> +	/* DBGOSLSR */
> +	{ CRn( 1), CRm( 1), Op1( 0), Op2( 4), is32, trap_raz_wi },
> +	DBGBXVR(2),
> +	DBGBXVR(3),
> +	/* DBGOSDLRd */
> +	{ CRn( 1), CRm( 3), Op1( 0), Op2( 4), is32, trap_raz_wi },
> +	DBGBXVR(4),
> +	DBGBXVR(5),
> +	/* DBGPRSRa */
> +	{ CRn( 1), CRm( 5), Op1( 0), Op2( 4), is32, trap_raz_wi },
> +
> +	DBGBXVR(6),
> +	DBGBXVR(7),
> +	DBGBXVR(8),
> +	DBGBXVR(9),
> +	DBGBXVR(10),
> +	DBGBXVR(11),
> +	DBGBXVR(12),
> +	DBGBXVR(13),
> +	DBGBXVR(14),
> +	DBGBXVR(15),
>  };
>  
>  /* Target specific emulation tables */
> -- 
> 1.7.12.4
> 
Otherwise this looks ok,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30  9:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30  9:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:31PM +0800, Zhichao Huang wrote:
> The trapping code keeps track of the state of the debug registers,
> allowing for the switch code to implement a lazy switching strategy.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/include/asm/kvm_asm.h  |  3 +++
>  arch/arm/include/asm/kvm_host.h |  3 +++
>  arch/arm/kernel/asm-offsets.c   |  1 +
>  arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
>  arch/arm/kvm/interrupts_head.S  | 42 +++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 86 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index ba65e05..4fb64cf 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -64,6 +64,9 @@
>  #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
>  #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
>  
> +#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
> +#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
> +
>  #define ARM_EXCEPTION_RESET	  0
>  #define ARM_EXCEPTION_UNDEFINED   1
>  #define ARM_EXCEPTION_SOFTWARE    2
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 3d16820..09b54bf 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
>  	/* System control coprocessor (cp14) */
>  	u32 cp14[NR_CP14_REGS];
>  
> +	/* Debug state */
> +	u32 debug_flags;
> +
>  	/*
>  	 * Anything that is not used directly from assembly code goes
>  	 * here.
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 9158de0..e876109 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -185,6 +185,7 @@ int main(void)
>    DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu, arch.regs.fiq_regs));
>    DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_pc));
>    DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu, arch.regs.usr_regs.ARM_cpsr));
> +  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu, arch.debug_flags));
>    DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
>    DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
>    DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> index eeee648..fc0c2ef 100644
> --- a/arch/arm/kvm/coproc.c
> +++ b/arch/arm/kvm/coproc.c
> @@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
>  	return true;
>  }
>  
> +/*
> + * We want to avoid world-switching all the DBG registers all the
> + * time:
> + *
> + * - If we've touched any debug register, it is likely that we're
> + *   going to touch more of them. It then makes sense to disable the
> + *   traps and start doing the save/restore dance
> + * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
> + *   to save/restore the registers, as the guest depends on them.
> + *
> + * For this, we use a DIRTY bit, indicating the guest has modified the
> + * debug registers, used as follow:
> + *
> + * On guest entry:
> + * - If the dirty bit is set (because we're coming back from trapping),
> + *   disable the traps, save host registers, restore guest registers.
> + * - If debug is actively in use (ARM_DSCR_MDBGEN set),
> + *   set the dirty bit, disable the traps, save host registers,
> + *   restore guest registers.
> + * - Otherwise, enable the traps
> + *
> + * On guest exit:
> + * - If the dirty bit is set, save guest registers, restore host
> + *   registers and clear the dirty bit. This ensure that the host can
> + *   now use the debug registers.
> + *
> + * Notice:
> + * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
> + *   debug is always actively in use (ARM_DSCR_MDBGEN set).
> + *   We have to do the save/restore dance in this case, because the
> + *   host and the guest might use their respective debug registers
> + *   at any moment.

so doesn't this pretty much invalidate the whole saving/dirty effort?

Guests configured with, for example, multi_v7_defconfig will then act like
this, and you will always save/restore all these registers.

Wouldn't a better approach be to enable trapping to hyp mode most of the
time, and simply clear the enable bit of any host-used break- or
watchpoints upon guest entry, perhaps maintaining a bitmap of which ones
must be re-set when exiting the guest, thereby drastically reducing the
amount of save/restore work you'd have to perform?

Of course, you'd also have to keep track of whether the guest has any
breakpoints or watchpoints enabled for when you do the full save/restore
dance.

That would also avoid all issues surrounding accesses to DBGDSCRext
register I think.
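
Just to sketch the idea (completely untested; kvm_arm_mask_host_debug_regs
and host_debug_mask are made-up names, and read_wb_reg()/write_wb_reg()
stand in for whatever accessor ends up being usable from KVM -- the
helpers of that name in arch/arm/kernel/hw_breakpoint.c are static there
today):

	static void kvm_arm_mask_host_debug_regs(struct kvm_vcpu *vcpu,
						 int nr_brps, int nr_wrps)
	{
		int i;
		u32 ctrl;

		vcpu->arch.host_debug_mask = 0;

		/* Disable host breakpoints, remembering which were enabled */
		for (i = 0; i < nr_brps; i++) {
			ctrl = read_wb_reg(ARM_BASE_BCR + i);
			if (ctrl & 0x1) {		/* E, bit[0] */
				vcpu->arch.host_debug_mask |= 1 << i;
				write_wb_reg(ARM_BASE_BCR + i, ctrl & ~0x1);
			}
		}

		/* ...and the same for host watchpoints */
		for (i = 0; i < nr_wrps; i++) {
			ctrl = read_wb_reg(ARM_BASE_WCR + i);
			if (ctrl & 0x1) {
				vcpu->arch.host_debug_mask |= 1 << (16 + i);
				write_wb_reg(ARM_BASE_WCR + i, ctrl & ~0x1);
			}
		}
	}

On exit you'd walk the mask and set the E bits again, so in the common
case you only touch the handful of registers the host actually uses.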

> + */
>  static bool trap_debug32(struct kvm_vcpu *vcpu,
>  			const struct coproc_params *p,
>  			const struct coproc_reg *r)
>  {
> -	if (p->is_write)
> +	if (p->is_write) {
>  		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
> -	else
> +		vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;
> +	} else {
>  		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
> +	}
>  
>  	return true;
>  }
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index a20b9ad..5662c39 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -1,4 +1,6 @@
>  #include <linux/irqchip/arm-gic.h>
> +#include <asm/hw_breakpoint.h>
> +#include <asm/kvm_asm.h>
>  #include <asm/assembler.h>
>  
>  #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
> @@ -407,6 +409,46 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	mcr	p15, 2, r12, c0, c0, 0	@ CSSELR
>  .endm
>  
> +/* Assume vcpu pointer in vcpu reg, clobbers r5 */
> +.macro skip_debug_state target
> +	ldr	r5, [vcpu, #VCPU_DEBUG_FLAGS]
> +	cmp	r5, #KVM_ARM_DEBUG_DIRTY
> +	bne	\target
> +1:
> +.endm
> +
> +/* Compute debug state: If ARM_DSCR_MDBGEN or KVM_ARM_DEBUG_DIRTY
> + * is set, we do a full save/restore cycle and disable trapping.
> + *
> + * Assumes vcpu pointer in vcpu reg
> + *
> + * Clobbers r5, r6
> + */
> +.macro compute_debug_state target
> +	// Check the state of MDSCR_EL1
> +	ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
> +	and	r6, r5, #ARM_DSCR_MDBGEN
> +	cmp	r6, #0

you can just do 'ands' here, or even tst and you don't have to touch r6.
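
i.e. (untested; the beq below stays the same, since tst sets Z exactly
like the and/cmp pair did):

	ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
	tst	r5, #ARM_DSCR_MDBGEN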

> +	beq	9998f	   // Nothing to see there
> +
> +	// If ARM_DSCR_MDBGEN bit was set, we must set the flag
> +	mov	r5, #KVM_ARM_DEBUG_DIRTY
> +	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
> +	b	9999f	   // Don't skip restore
> +
> +9998:
> +	// Otherwise load the flags from memory in case we recently
> +	// trapped
> +	skip_debug_state \target
> +9999:
> +.endm
> +
> +/* Assume vcpu pointer in vcpu reg, clobbers r5 */
> +.macro clear_debug_dirty_bit
> +	mov	r5, #0
> +	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
> +.endm
> +
>  /*
>   * Save the VGIC CPU state into memory
>   *
> -- 
> 1.7.12.4
> 
Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm: implement lazy world switch for debug registers
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:15     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:15 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:32PM +0800, Zhichao Huang wrote:
> Implement switching of the debug registers. While the number
> of registers is massive, CPUs usually don't implement them all
> (A15 has 6 breakpoints and 4 watchpoints, which gives us a total
> of 22 registers "only").
> 
> Notice that, for ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in
> the guest, debug is always actively in use (ARM_DSCR_MDBGEN set).
> 
> We have to do the save/restore dance in this case, because the host
> and the guest might use their respective debug registers at any moment.

this sounds expensive, and I suggested an alternative approach in the
previous patch.  In any case, measuring the impact of this on hardware
would be a great idea...

> 
> If the CONFIG_HAVE_HW_BREAKPOINT is not set, and if no one flagged
> the debug registers as dirty, we only save/resotre DBGDSCR.

restore

> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/kvm/interrupts.S      |  16 +++
>  arch/arm/kvm/interrupts_head.S | 249 ++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 263 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 79caf79..d626275 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -116,6 +116,12 @@ ENTRY(__kvm_vcpu_run)
>  	read_cp15_state store_to_vcpu = 0
>  	write_cp15_state read_from_vcpu = 1
>  
> +	@ Store hardware CP14 state and load guest state
> +	compute_debug_state 1f
> +	bl __save_host_debug_regs
> +	bl __restore_guest_debug_regs
> +
> +1:
>  	@ If the host kernel has not been configured with VFPv3 support,
>  	@ then it is safer if we deny guests from using it as well.
>  #ifdef CONFIG_VFPv3
> @@ -201,6 +207,16 @@ after_vfp_restore:
>  	mrc	p15, 0, r2, c0, c0, 5
>  	mcr	p15, 4, r2, c0, c0, 5
>  
> +	@ Store guest CP14 state and restore host state
> +	skip_debug_state 1f
> +	bl __save_guest_debug_regs
> +	bl __restore_host_debug_regs
> +	/* Clear the dirty flag for the next run, as all the state has
> +	 * already been saved. Note that we nuke the whole 32bit word.
> +	 * If we ever add more flags, we'll have to be more careful...
> +	 */
> +	clear_debug_dirty_bit
> +1:
>  	@ Store guest CP15 state and restore host state
>  	read_cp15_state store_to_vcpu = 1
>  	write_cp15_state read_from_vcpu = 0
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 5662c39..ed406be 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -7,6 +7,7 @@
>  #define VCPU_USR_SP		(VCPU_USR_REG(13))
>  #define VCPU_USR_LR		(VCPU_USR_REG(14))
>  #define CP15_OFFSET(_cp15_reg_idx) (VCPU_CP15 + (_cp15_reg_idx * 4))
> +#define CP14_OFFSET(_cp14_reg_idx) (VCPU_CP14 + ((_cp14_reg_idx) * 4))
>  
>  /*
>   * Many of these macros need to access the VCPU structure, which is always
> @@ -168,8 +169,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>   * Clobbers *all* registers.
>   */
>  .macro restore_guest_regs
> -	/* reset DBGDSCR to disable debug mode */
> -	mov	r2, #0
> +	ldr	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
>  	mcr	p14, 0, r2, c0, c2, 2
>  
>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
> @@ -250,6 +250,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	save_guest_regs_mode abt, #VCPU_ABT_REGS
>  	save_guest_regs_mode und, #VCPU_UND_REGS
>  	save_guest_regs_mode irq, #VCPU_IRQ_REGS
> +
> +	/* DBGDSCR reg */
> +	mrc	p14, 0, r2, c0, c1, 0
> +	str	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
>  .endm
>  
>  /* Reads cp15 registers from hardware and stores them in memory
> @@ -449,6 +453,231 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
>  .endm
>  
> +/* Assume r11/r12 in used, clobbers r2-r10 */
> +.macro cp14_read_and_push Op2 skip_num
> +	cmp	\skip_num, #8
> +	// if (skip_num >= 8) then skip c8-c15 directly
> +	bge	1f
> +	adr	r2, 9998f
> +	add	r2, r2, \skip_num, lsl #2
> +	bx	r2
> +1:
> +	adr	r2, 9999f
> +	sub	r3, \skip_num, #8
> +	add	r2, r2, r3, lsl #2
> +	bx	r2
> +9998:
> +	mrc	p14, 0, r10, c0, c15, \Op2
> +	mrc	p14, 0, r9, c0, c14, \Op2
> +	mrc	p14, 0, r8, c0, c13, \Op2
> +	mrc	p14, 0, r7, c0, c12, \Op2
> +	mrc	p14, 0, r6, c0, c11, \Op2
> +	mrc	p14, 0, r5, c0, c10, \Op2
> +	mrc	p14, 0, r4, c0, c9, \Op2
> +	mrc	p14, 0, r3, c0, c8, \Op2
> +	push	{r3-r10}

you probably don't want to do more stores to memory than required

> +9999:
> +	mrc	p14, 0, r10, c0, c7, \Op2
> +	mrc	p14, 0, r9, c0, c6, \Op2
> +	mrc	p14, 0, r8, c0, c5, \Op2
> +	mrc	p14, 0, r7, c0, c4, \Op2
> +	mrc	p14, 0, r6, c0, c3, \Op2
> +	mrc	p14, 0, r5, c0, c2, \Op2
> +	mrc	p14, 0, r4, c0, c1, \Op2
> +	mrc	p14, 0, r3, c0, c0, \Op2
> +	push	{r3-r10}

same

> +.endm
> +
> +/* Assume r11/r12 in used, clobbers r2-r10 */
> +.macro cp14_pop_and_write Op2 skip_num
> +	cmp	\skip_num, #8
> +	// if (skip_num >= 8) then skip c8-c15 directly
> +	bge	1f
> +	adr	r2, 9998f
> +	add	r2, r2, \skip_num, lsl #2
> +	pop	{r3-r10}

you probably don't want to do more loads from memory than required

> +	bx	r2
> +1:
> +	adr	r2, 9999f
> +	sub	r3, \skip_num, #8
> +	add	r2, r2, r3, lsl #2
> +	pop	{r3-r10}

same

> +	bx	r2
> +
> +9998:
> +	mcr	p14, 0, r10, c0, c15, \Op2
> +	mcr	p14, 0, r9, c0, c14, \Op2
> +	mcr	p14, 0, r8, c0, c13, \Op2
> +	mcr	p14, 0, r7, c0, c12, \Op2
> +	mcr	p14, 0, r6, c0, c11, \Op2
> +	mcr	p14, 0, r5, c0, c10, \Op2
> +	mcr	p14, 0, r4, c0, c9, \Op2
> +	mcr	p14, 0, r3, c0, c8, \Op2
> +
> +	pop	{r3-r10}
> +9999:
> +	mcr	p14, 0, r10, c0, c7, \Op2
> +	mcr	p14, 0, r9, c0, c6, \Op2
> +	mcr	p14, 0, r8, c0, c5, \Op2
> +	mcr	p14, 0, r7, c0, c4, \Op2
> +	mcr	p14, 0, r6, c0, c3, \Op2
> +	mcr	p14, 0, r5, c0, c2, \Op2
> +	mcr	p14, 0, r4, c0, c1, \Op2
> +	mcr	p14, 0, r3, c0, c0, \Op2
> +.endm
> +
> +/* Assume r11/r12 in used, clobbers r2-r3 */
> +.macro cp14_read_and_str Op2 cp14_reg0 skip_num
> +	adr	r3, 1f
> +	add	r3, r3, \skip_num, lsl #3
> +	bx	r3
> +1:
> +	mrc	p14, 0, r2, c0, c15, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
> +	mrc	p14, 0, r2, c0, c14, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
> +	mrc	p14, 0, r2, c0, c13, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
> +	mrc	p14, 0, r2, c0, c12, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
> +	mrc	p14, 0, r2, c0, c11, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
> +	mrc	p14, 0, r2, c0, c10, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
> +	mrc	p14, 0, r2, c0, c9, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
> +	mrc	p14, 0, r2, c0, c8, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
> +	mrc	p14, 0, r2, c0, c7, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
> +	mrc	p14, 0, r2, c0, c6, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
> +	mrc	p14, 0, r2, c0, c5, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
> +	mrc	p14, 0, r2, c0, c4, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
> +	mrc	p14, 0, r2, c0, c3, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
> +	mrc	p14, 0, r2, c0, c2, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
> +	mrc	p14, 0, r2, c0, c1, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
> +	mrc	p14, 0, r2, c0, c0, \Op2
> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
> +.endm
> +
> +/* Assume r11/r12 in used, clobbers r2-r3 */
> +.macro cp14_ldr_and_write Op2 cp14_reg0 skip_num
> +	adr	r3, 1f
> +	add	r3, r3, \skip_num, lsl #3
> +	bx	r3
> +1:
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
> +	mcr	p14, 0, r2, c0, c15, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
> +	mcr	p14, 0, r2, c0, c14, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
> +	mcr	p14, 0, r2, c0, c13, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
> +	mcr	p14, 0, r2, c0, c12, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
> +	mcr	p14, 0, r2, c0, c11, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
> +	mcr	p14, 0, r2, c0, c10, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
> +	mcr	p14, 0, r2, c0, c9, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
> +	mcr	p14, 0, r2, c0, c8, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
> +	mcr	p14, 0, r2, c0, c7, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
> +	mcr	p14, 0, r2, c0, c6, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
> +	mcr	p14, 0, r2, c0, c5, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
> +	mcr	p14, 0, r2, c0, c4, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
> +	mcr	p14, 0, r2, c0, c3, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
> +	mcr	p14, 0, r2, c0, c2, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
> +	mcr	p14, 0, r2, c0, c1, \Op2
> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
> +	mcr	p14, 0, r2, c0, c0, \Op2
> +.endm

can you not find some way of unifying cp14_pop_and_write with
cp14_ldr_and_write, and cp14_read_and_push with cp14_read_and_str?

Probably having two separate structs for the debug state on the vcpu
struct (one for the guest and one for the host), as we do for the VFP
state, is one possible way of doing so.
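
Something like this (struct and field names invented, just to
illustrate) would let asm-offsets.c emit a host base offset and a guest
base offset, so a single save/restore macro could take the base as a
parameter instead of hard-coding push/pop vs. str/ldr:

	/* one instance each for host and guest state in kvm_vcpu_arch */
	struct kvm_cpu_debug_state {
		u32	bvr[16];
		u32	bcr[16];
		u32	wvr[16];
		u32	wcr[16];
	};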

> +
> +/* Get extract number of BRPs and WRPs. Saved in r11/r12 */
> +.macro read_hw_dbg_num
> +	mrc	p14, 0, r2, c0, c0, 0
> +	ubfx	r11, r2, #24, #4
> +	add	r11, r11, #1		// Extract BRPs
> +	ubfx	r12, r2, #28, #4
> +	add	r12, r12, #1		// Extract WRPs
> +	mov	r2, #16
> +	sub	r11, r2, r11		// How many BPs to skip
> +	sub	r12, r2, r12		// How many WPs to skip
> +.endm
> +
> +/* Reads cp14 registers from hardware.

You have a lot of multi-line comments in these patches which don't start
with a separate '/*' line, as dictated by the Linux kernel coding style.
So far, I've ignored this, but please fix all these throughout the
series when you respin.
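
i.e. with the opening '/*' on a line of its own:

	/*
	 * Reads cp14 registers from hardware.
	 * Writes cp14 registers in-order to the stack.
	 */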

> + * Writes cp14 registers in-order to the stack.
> + *
> + * Assumes vcpu pointer in vcpu reg
> + *
> + * Clobbers r2-r12
> + */
> +.macro save_host_debug_regs
> +	read_hw_dbg_num
> +	cp14_read_and_push #4, r11	@ DBGBVR
> +	cp14_read_and_push #5, r11	@ DBGBCR
> +	cp14_read_and_push #6, r12	@ DBGWVR
> +	cp14_read_and_push #7, r12	@ DBGWCR
> +.endm
> +
> +/* Reads cp14 registers from hardware.
> + * Writes cp14 registers in-order to the VCPU struct pointed to by vcpup.
> + *
> + * Assumes vcpu pointer in vcpu reg
> + *
> + * Clobbers r2-r12
> + */
> +.macro save_guest_debug_regs
> +	read_hw_dbg_num
> +	cp14_read_and_str #4, cp14_DBGBVR0, r11

why do you need the hash ('#') before the op2 field?

> +	cp14_read_and_str #5, cp14_DBGBCR0, r11
> +	cp14_read_and_str #6, cp14_DBGWVR0, r12
> +	cp14_read_and_str #7, cp14_DBGWCR0, r12
> +.endm
> +
> +/* Reads cp14 registers in-order from the stack.
> + * Writes cp14 registers to hardware.
> + *
> + * Assumes vcpu pointer in vcpu reg
> + *
> + * Clobbers r2-r12
> + */
> +.macro restore_host_debug_regs
> +	read_hw_dbg_num
> +	cp14_pop_and_write #4, r11	@ DBGBVR
> +	cp14_pop_and_write #5, r11	@ DBGBCR
> +	cp14_pop_and_write #6, r12	@ DBGWVR
> +	cp14_pop_and_write #7, r12	@ DBGWCR
> +.endm
> +
> +/* Reads cp14 registers in-order from the VCPU struct pointed to by vcpup
> + * Writes cp14 registers to hardware.
> + *
> + * Assumes vcpu pointer in vcpu reg
> + *
> + * Clobbers r2-r12
> + */
> +.macro restore_guest_debug_regs
> +	read_hw_dbg_num
> +	cp14_ldr_and_write #4, cp14_DBGBVR0, r11
> +	cp14_ldr_and_write #5, cp14_DBGBCR0, r11
> +	cp14_ldr_and_write #6, cp14_DBGWVR0, r12
> +	cp14_ldr_and_write #7, cp14_DBGWCR0, r12
> +.endm
> +
>  /*
>   * Save the VGIC CPU state into memory
>   *
> @@ -684,3 +913,19 @@ ARM_BE8(rev	r6, r6  )
>  .macro load_vcpu
>  	mrc	p15, 4, vcpu, c13, c0, 2	@ HTPIDR
>  .endm
> +
> +__save_host_debug_regs:
> +	save_host_debug_regs
> +	bx	lr
> +
> +__save_guest_debug_regs:
> +	save_guest_debug_regs
> +	bx	lr
> +
> +__restore_host_debug_regs:
> +	restore_host_debug_regs
> +	bx	lr
> +
> +__restore_guest_debug_regs:
> +	restore_guest_debug_regs
> +	bx	lr
> -- 
> 1.7.12.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 11/11] KVM: arm: enable trapping of all debug registers
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:19     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:19 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:34PM +0800, Zhichao Huang wrote:
> Enable trapping of the debug registers, allowing guests to use
> the debug infrastructure.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/kvm/interrupts_head.S | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index ed406be..107bda4 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -886,10 +886,21 @@ ARM_BE8(rev	r6, r6  )
>  .endm
>  
>  /* Configures the HDCR (Hyp Debug Configuration Register) on entry/return
> - * (hardware reset value is 0) */
> + * (hardware reset value is 0)
> + *
> + * Clobbers r2-r4
> + */
>  .macro set_hdcr operation
>  	mrc	p15, 4, r2, c1, c1, 1
> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA)
> +
> +	// Check for KVM_ARM_DEBUG_DIRTY, and set debug to trap
> +	// if not dirty.
> +	ldr	r4, [vcpu, #VCPU_DEBUG_FLAGS]
> +	cmp	r4, #KVM_ARM_DEBUG_DIRTY
> +	beq	1f
> +	orr	r3, r3,  #HDCR_TDA
> +1:

It would make me slightly calmer if you always unconditionally cleared
HDCR_TDA on vmexit, but ok.
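
e.g. something like (untested -- keeping the existing orr/bic pair and
just adding one bic on the exit path):

	.if \operation == vmentry
	orr	r2, r2, r3		@ Trap some perfmon accesses
	.else
	bic	r2, r2, r3		@ Don't trap any perfmon accesses
	bic	r2, r2, #HDCR_TDA	@ ...and always stop trapping debug
	.endif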

>  	.if \operation == vmentry
>  	orr	r2, r2, r3		@ Trap some perfmon accesses
>  	.else
> -- 
> 1.7.12.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 02/11] KVM: arm: rename pm_fake handler to trap_raz_wi
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:25PM +0800, Zhichao Huang wrote:
> pm_fake doesn't quite describe what the handler does (ignoring writes
> and returning 0 for reads).
> 
> As we're about to use it (a lot) in a different context, rename it
> with a (admitedly cryptic) name that make sense for all users.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> Reviewed-by: Alex Bennee <alex.bennee@linaro.org>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 03/11] KVM: arm: enable to use the ARM_DSCR_MDBGEN macro from KVM assembly code
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel

On Mon, Jun 22, 2015 at 06:41:26PM +0800, Zhichao Huang wrote:
> Add #ifndef __ASSEMBLY__ in hw_breakpoint.h, in order to use
> the ARM_DSCR_MDBGEN macro from KVM assembly code.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> Reviewed-by: Alex Bennee <alex.bennee@linaro.org>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 05/11] KVM: arm: check ordering of all system register tables
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:28PM +0800, Zhichao Huang wrote:
> We now have multiple tables for the various system registers
> we trap. Make sure we check the order of all of them, as it is
> critical that we get the order right (been there, done that...).
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit debug registers
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:30PM +0800, Zhichao Huang wrote:
> Add handlers for all the 64-bit debug registers.
> 
> There is an overlap between 32 and 64bit registers. Make sure that
> 64-bit registers preceding 32-bit ones.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> ---
>  arch/arm/kvm/coproc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> index 59b65b7..eeee648 100644
> --- a/arch/arm/kvm/coproc.c
> +++ b/arch/arm/kvm/coproc.c
> @@ -435,9 +435,17 @@ static const struct coproc_reg cp15_regs[] = {
>  	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
>  
>  /*
> + * Architected CP14 registers.
> + *

belongs in another patch?

>   * Trapped cp14 registers. We generally ignore most of the external
>   * debug, on the principle that they don't really make sense to a
>   * guest. Revisit this one day, whould this principle change.
> + *
> + * CRn denotes the primary register number, but is copied to the CRm in the
> + * user space API for 64-bit register access in line with the terminology used
> + * in the ARM ARM.
> + * Important: Must be sorted ascending by CRn, CRM, Op1, Op2 and with 64-bit
> + *            registers preceding 32-bit ones.
>   */
>  static const struct coproc_reg cp14_regs[] = {
>  	/* DBGIDR */
> @@ -445,10 +453,14 @@ static const struct coproc_reg cp14_regs[] = {
>  	/* DBGDTRRXext */
>  	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
>  	DBG_BCR_BVR_WCR_WVR(0),
> +	/* DBGDRAR (64bit) */
> +	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is64, trap_raz_wi},
>  	/* DBGDSCRint */
>  	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
>  				NULL, cp14_DBGDSCRext },
>  	DBG_BCR_BVR_WCR_WVR(1),
> +	/* DBGDSAR (64bit) */
> +	{ CRn( 0), CRm( 2), Op1( 0), Op2( 0), is64, trap_raz_wi},
>  	/* DBGDSCRext */
>  	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
>  				reset_val, cp14_DBGDSCRext, 0 },
> -- 
> 1.7.12.4
> 
Otherwise:
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/11] KVM: arm: add a trace event for cp14 traps
  2015-06-22 10:41   ` Zhichao Huang
@ 2015-06-30 13:20     ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-06-30 13:20 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Mon, Jun 22, 2015 at 06:41:33PM +0800, Zhichao Huang wrote:
> There are too many cp15 traps, so we don't reuse the cp15 trace event
> but add a new trace event to trace the access of debug registers.
> 
> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 01/11] KVM: arm: plug guest debug exploit
  2015-06-29 15:49     ` Christoffer Dall
@ 2015-07-01  7:04       ` zichao
  -1 siblings, 0 replies; 82+ messages in thread
From: zichao @ 2015-07-01  7:04 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao, stable



On June 29, 2015 11:49:53 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:24PM +0800, Zhichao Huang wrote:
>> Hardware debugging in guests is not intercepted currently, it means
>> that a malicious guest can bring down the entire machine by writing
>> to the debug registers.
>> 
>> This patch enable trapping of all debug registers, preventing the
>guests
>> to access the debug registers.
>> 
>> This patch also disable the debug mode(DBGDSCR) in the guest world
>all
>> the time, preventing the guests to mess with the host state.
>> 
>> However, it is a precursor for later patches which will need to do
>> more to world switch debug states while necessary.
>> 
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_coproc.h |  3 +-
>>  arch/arm/kvm/coproc.c             | 60
>+++++++++++++++++++++++++++++++++++----
>>  arch/arm/kvm/handle_exit.c        |  4 +--
>>  arch/arm/kvm/interrupts_head.S    | 13 ++++++++-
>>  4 files changed, 70 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch/arm/include/asm/kvm_coproc.h
>b/arch/arm/include/asm/kvm_coproc.h
>> index 4917c2f..e74ab0f 100644
>> --- a/arch/arm/include/asm/kvm_coproc.h
>> +++ b/arch/arm/include/asm/kvm_coproc.h
>> @@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct
>kvm_coproc_target_table *table);
>>  int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>  int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run
>*run);
>>  int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run
>*run);
>> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run
>*run);
>> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
>> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>  int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>  int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>  
>> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> index f3d88dc..2e12760 100644
>> --- a/arch/arm/kvm/coproc.c
>> +++ b/arch/arm/kvm/coproc.c
>> @@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu
>*vcpu, struct kvm_run *run)
>>  	return 1;
>>  }
>>  
>> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run
>*run)
>> -{
>> -	kvm_inject_undefined(vcpu);
>> -	return 1;
>> -}
>> -
>>  static void reset_mpidr(struct kvm_vcpu *vcpu, const struct
>coproc_reg *r)
>>  {
>>  	/*
>> @@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu,
>struct kvm_run *run)
>>  	return emulate_cp15(vcpu, &params);
>>  }
>>  
>> +/**
>> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
>access
>> + * @vcpu: The VCPU pointer
>> + * @run:  The kvm_run struct
>> + */
>> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +	struct coproc_params params;
>> +
>> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
>> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
>> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
>> +	params.is_64bit = true;
>> +
>> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
>> +	params.Op2 = 0;
>> +	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>> +	params.CRm = 0;
>
>this is a complete duplicate of kvm_handle_cp15_64, can you share this
>code somehow?
>

This patch just wants to plug the exploit in the simplest way; I shared the cp14/cp15 handlers in a later patch [PATCH v3 04/11].

Should I take the patch [04/11] ahead of current patch [01/11] ?

>> +
>> +	/* raz_wi */
>> +	(void)pm_fake(vcpu, &params, NULL);
>> +
>> +	/* handled */
>> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> +	return 1;
>> +}
>> +
>> +/**
>> + * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14
>access
>> + * @vcpu: The VCPU pointer
>> + * @run:  The kvm_run struct
>> + */
>> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +	struct coproc_params params;
>> +
>> +	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
>> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
>> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
>> +	params.is_64bit = false;
>> +
>> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
>> +	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
>> +	params.Rt2 = 0;
>
>this is a complete duplicate of kvm_handle_cp15_32, can you share this
>code somehow?
>
>> +
>> +	/* raz_wi */
>> +	(void)pm_fake(vcpu, &params, NULL);
>> +
>> +	/* handled */
>> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> +	return 1;
>> +}
>> +
>> 
>/******************************************************************************
>>   * Userspace API
>>  
>*****************************************************************************/
>> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
>> index 95f12b2..357ad1b 100644
>> --- a/arch/arm/kvm/handle_exit.c
>> +++ b/arch/arm/kvm/handle_exit.c
>> @@ -104,9 +104,9 @@ static exit_handle_fn arm_exit_handlers[] = {
>>  	[HSR_EC_WFI]		= kvm_handle_wfx,
>>  	[HSR_EC_CP15_32]	= kvm_handle_cp15_32,
>>  	[HSR_EC_CP15_64]	= kvm_handle_cp15_64,
>> -	[HSR_EC_CP14_MR]	= kvm_handle_cp14_access,
>> +	[HSR_EC_CP14_MR]	= kvm_handle_cp14_32,
>>  	[HSR_EC_CP14_LS]	= kvm_handle_cp14_load_store,
>> -	[HSR_EC_CP14_64]	= kvm_handle_cp14_access,
>> +	[HSR_EC_CP14_64]	= kvm_handle_cp14_64,
>>  	[HSR_EC_CP_0_13]	= kvm_handle_cp_0_13_access,
>>  	[HSR_EC_CP10_ID]	= kvm_handle_cp10_id,
>>  	[HSR_EC_SVC_HYP]	= handle_svc_hyp,
>> diff --git a/arch/arm/kvm/interrupts_head.S
>b/arch/arm/kvm/interrupts_head.S
>> index 35e4a3a..f85c447 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -97,6 +97,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>  	mrs	r8, LR_fiq
>>  	mrs	r9, SPSR_fiq
>>  	push	{r2-r9}
>> +
>> +	/* DBGDSCR reg */
>> +	mrc	p14, 0, r2, c0, c1, 0
>> +	push	{r2}
>
>this feels like it should belong in read_cp15_state and not the gp regs
>portion ?
>

Happy to move it. But putting the cp14 regs in read/write_cp15_state still doesn't seem quite appropriate. Should I move it to __kvm_vcpu_return and __kvm_vcpu_run instead?

Another reason is that I want to disable debug mode (DBGDSCR) as early as possible.

>
>>  .endm
>>  
>>  .macro pop_host_regs_mode mode
>> @@ -111,6 +115,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>   * Clobbers all registers, in all modes, except r0 and r1.
>>   */
>>  .macro restore_host_regs
>> +	pop	{r2}
>> +	mcr	p14, 0, r2, c0, c2, 2
>> +
>
>Why are we reading the DBGDSCRint and writing the DBGDSCRext ?

Because DBGDSCRint is read-only; I borrowed the operation from the kernel:

arch/arm/kernel/hw_breakpoint.c:
ARM_DBG_READ(c0, c1, 0, dscr)
ARM_DBG_WRITE(c0, c2, 2, dscr)
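
For reference, those helpers boil down to mrc/mcr accesses on CP14. The sketch below is only an approximation of the upstream ARM_DBG_READ/ARM_DBG_WRITE macros, and the read_dbgdscr/write_dbgdscr wrappers are purely illustrative names; it shows the read of DBGDSCRint at (c0, c1, 0) and the write of DBGDSCRext at (c0, c2, 2):

/* Approximate sketch of the hw_breakpoint helpers (kernel context). */
#define ARM_DBG_READ(N, M, OP2, VAL) \
	asm volatile("mrc p14, 0, %0, " #N ", " #M ", " #OP2 : "=r" (VAL))

#define ARM_DBG_WRITE(N, M, OP2, VAL) \
	asm volatile("mcr p14, 0, %0, " #N ", " #M ", " #OP2 : : "r" (VAL))

static inline u32 read_dbgdscr(void)
{
	u32 dscr;

	/* DBGDSCRint: the read-only view of the status/control register */
	ARM_DBG_READ(c0, c1, 0, dscr);
	return dscr;
}

static inline void write_dbgdscr(u32 dscr)
{
	/* DBGDSCRext: the writable encoding of the same register */
	ARM_DBG_WRITE(c0, c2, 2, dscr);
}
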
>
>>  	pop	{r2-r9}
>>  	msr	r8_fiq, r2
>>  	msr	r9_fiq, r3
>> @@ -159,6 +166,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>   * Clobbers *all* registers.
>>   */
>>  .macro restore_guest_regs
>> +	/* reset DBGDSCR to disable debug mode */
>> +	mov	r2, #0
>> +	mcr	p14, 0, r2, c0, c2, 2
>
>Is it valid to write 0 in all all fields of this register?

I'm worried about that too, although it tests OK. Does Will have any suggestions?
>
>I thought Will expressed concern about accessing this register?  Why is
>it safe in this context and not before?  It seems from the spec that
>this can still raise an undefined exception if an external debugger
>lowers the software debug enable signal.
>
>> +
>>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
>>  	restore_guest_regs_mode abt, #VCPU_ABT_REGS
>>  	restore_guest_regs_mode und, #VCPU_UND_REGS
>> @@ -607,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
>>   * (hardware reset value is 0) */
>>  .macro set_hdcr operation
>>  	mrc	p15, 4, r2, c1, c1, 1
>> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
>> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
>>  	.if \operation == vmentry
>>  	orr	r2, r2, r3		@ Trap some perfmon accesses
>>  	.else
>> -- 
>> 1.7.12.4
>> 
>
>Thanks,
>-Christoffer

-- 
zhichao.huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
  2015-06-29 19:43     ` Christoffer Dall
@ 2015-07-01  7:09       ` zichao
  -1 siblings, 0 replies; 82+ messages in thread
From: zichao @ 2015-07-01  7:09 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel



On June 30, 2015 3:43:34 AM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:27PM +0800, Zhichao Huang wrote:
>> As we're about to trap a bunch of CP14 registers, let's rework
>> the CP15 handling so it can be generalized and work with multiple
>> tables.
>> 
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/kvm/coproc.c          | 176
>++++++++++++++++++++++++++---------------
>>  arch/arm/kvm/interrupts_head.S |   2 +-
>>  2 files changed, 112 insertions(+), 66 deletions(-)
>> 
>> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> index 9d283d9..d23395b 100644
>> --- a/arch/arm/kvm/coproc.c
>> +++ b/arch/arm/kvm/coproc.c
>> @@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = {
>>  	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
>>  };
>>  
>> +static const struct coproc_reg cp14_regs[] = {
>> +};
>> +
>>  /* Target specific emulation tables */
>>  static struct kvm_coproc_target_table
>*target_tables[KVM_ARM_NUM_TARGETS];
>>  
>> @@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const
>struct coproc_params *params,
>>  	return NULL;
>>  }
>>  
>> -static int emulate_cp15(struct kvm_vcpu *vcpu,
>> -			const struct coproc_params *params)
>> +/*
>> + * emulate_cp --  tries to match a cp14/cp15 access in a handling
>table,
>> + *                and call the corresponding trap handler.
>> + *
>> + * @params: pointer to the descriptor of the access
>> + * @table: array of trap descriptors
>> + * @num: size of the trap descriptor array
>> + *
>> + * Return 0 if the access has been handled, and -1 if not.
>> + */
>> +static int emulate_cp(struct kvm_vcpu *vcpu,
>> +			const struct coproc_params *params,
>> +			const struct coproc_reg *table,
>> +			size_t num)
>>  {
>> -	size_t num;
>> -	const struct coproc_reg *table, *r;
>> -
>> -	trace_kvm_emulate_cp15_imp(params->Op1, params->Rt1, params->CRn,
>> -				   params->CRm, params->Op2, params->is_write);
>> +	const struct coproc_reg *r;
>>  
>> -	table = get_target_table(vcpu->arch.target, &num);
>> +	if (!table)
>> +		return -1;	/* Not handled */
>>  
>> -	/* Search target-specific then generic table. */
>>  	r = find_reg(params, table, num);
>> -	if (!r)
>> -		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
>>  
>> -	if (likely(r)) {
>> +	if (r) {
>>  		/* If we don't have an accessor, we should never get here! */
>>  		BUG_ON(!r->access);
>>  
>>  		if (likely(r->access(vcpu, params, r))) {
>>  			/* Skip instruction, since it was emulated */
>>  			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> -			return 1;
>>  		}
>> -		/* If access function fails, it should complain. */
>> -	} else {
>> -		kvm_err("Unsupported guest CP15 access at: %08lx\n",
>> -			*vcpu_pc(vcpu));
>> -		print_cp_instr(params);
>> +
>> +		/* Handled */
>> +		return 0;
>>  	}
>> +
>> +	/* Not handled */
>> +	return -1;
>> +}
>> +
>> +static void unhandled_cp_access(struct kvm_vcpu *vcpu,
>> +				const struct coproc_params *params)
>> +{
>> +	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
>> +	int cp;
>> +
>> +	switch (hsr_ec) {
>> +	case HSR_EC_CP15_32:
>> +	case HSR_EC_CP15_64:
>> +		cp = 15;
>> +		break;
>> +	case HSR_EC_CP14_MR:
>> +	case HSR_EC_CP14_64:
>> +		cp = 14;
>> +		break;
>> +	default:
>> +		WARN_ON((cp = -1));
>> +	}
>> +
>> +	kvm_err("Unsupported guest CP%d access at: %08lx\n",
>> +		cp, *vcpu_pc(vcpu));
>> +	print_cp_instr(params);
>>  	kvm_inject_undefined(vcpu);
>> -	return 1;
>>  }
>>  
>> -/**
>> - * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15
>access
>> - * @vcpu: The VCPU pointer
>> - * @run:  The kvm_run struct
>> - */
>> -int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
>> +			const struct coproc_reg *global,
>> +			size_t nr_global,
>> +			const struct coproc_reg *target_specific,
>> +			size_t nr_specific)
>>  {
>>  	struct coproc_params params;
>>  
>> @@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu,
>struct kvm_run *run)
>>  	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>>  	params.CRm = 0;
>>  
>> -	return emulate_cp15(vcpu, &params);
>> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
>> +		return 1;
>> +	if (!emulate_cp(vcpu, &params, global, nr_global))
>> +		return 1;
>> +
>> +	unhandled_cp_access(vcpu, &params);
>> +	return 1;
>>  }
>>  
>>  static void reset_coproc_regs(struct kvm_vcpu *vcpu,
>> @@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu
>*vcpu,
>>  			table[i].reset(vcpu, &table[i]);
>>  }
>>  
>> -/**
>> - * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15
>access
>> - * @vcpu: The VCPU pointer
>> - * @run:  The kvm_run struct
>> - */
>> -int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
>> +			const struct coproc_reg *global,
>> +			size_t nr_global,
>> +			const struct coproc_reg *target_specific,
>> +			size_t nr_specific)
>>  {
>>  	struct coproc_params params;
>>  
>> @@ -510,33 +546,57 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu,
>struct kvm_run *run)
>>  	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
>>  	params.Rt2 = 0;
>>  
>> -	return emulate_cp15(vcpu, &params);
>> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
>> +		return 1;
>> +	if (!emulate_cp(vcpu, &params, global, nr_global))
>> +		return 1;
>> +
>> +	unhandled_cp_access(vcpu, &params);
>> +	return 1;
>>  }
>>  
>>  /**
>> - * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
>access
>> + * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15
>access
>>   * @vcpu: The VCPU pointer
>>   * @run:  The kvm_run struct
>>   */
>> -int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  {
>> -	struct coproc_params params;
>> +	const struct coproc_reg *target_specific;
>> +	size_t num;
>>  
>> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
>> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
>> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
>> -	params.is_64bit = true;
>> +	target_specific = get_target_table(vcpu->arch.target, &num);
>> +	return kvm_handle_cp_64(vcpu,
>> +				cp15_regs, ARRAY_SIZE(cp15_regs),
>> +				target_specific, num);
>> +}
>>  
>> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
>> -	params.Op2 = 0;
>> -	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>> -	params.CRm = 0;
>> +/**
>> + * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15
>access
>> + * @vcpu: The VCPU pointer
>> + * @run:  The kvm_run struct
>> + */
>> +int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +	const struct coproc_reg *target_specific;
>> +	size_t num;
>>  
>> -	(void)trap_raz_wi(vcpu, &params, NULL);
>> +	target_specific = get_target_table(vcpu->arch.target, &num);
>> +	return kvm_handle_cp_32(vcpu,
>> +				cp15_regs, ARRAY_SIZE(cp15_regs),
>> +				target_specific, num);
>> +}
>>  
>> -	/* handled */
>> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> -	return 1;
>> +/**
>> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
>access
>> + * @vcpu: The VCPU pointer
>> + * @run:  The kvm_run struct
>> + */
>> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +	return kvm_handle_cp_64(vcpu,
>> +				cp14_regs, ARRAY_SIZE(cp14_regs),
>> +				NULL, 0);
>>  }
>>  
>>  /**
>> @@ -546,23 +606,9 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu,
>struct kvm_run *run)
>>   */
>>  int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  {
>> -	struct coproc_params params;
>> -
>> -	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
>> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
>> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
>> -	params.is_64bit = false;
>> -
>> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
>> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
>> -	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
>> -	params.Rt2 = 0;
>> -
>> -	(void)trap_raz_wi(vcpu, &params, NULL);
>> -
>> -	/* handled */
>> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> -	return 1;
>> +	return kvm_handle_cp_32(vcpu,
>> +				cp14_regs, ARRAY_SIZE(cp14_regs),
>> +				NULL, 0);
>>  }
>>  
>> 
>/******************************************************************************
>> diff --git a/arch/arm/kvm/interrupts_head.S
>b/arch/arm/kvm/interrupts_head.S
>> index f85c447..a20b9ad 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -618,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
>>   * (hardware reset value is 0) */
>>  .macro set_hdcr operation
>>  	mrc	p15, 4, r2, c1, c1, 1
>> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
>> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
>
>why do we stop trapping accesses here?

Because we haven't finished the trap handlers yet. If we kept trapping enabled here, the VM would not run normally, since the handlers now use unhandled_cp_access instead of trap_raz_wi.

I only enable trapping once everything is in place, in the last patch [11/11].
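
For context, the distinction matters because a RAZ/WI handler silently completes the access (reads return zero, writes are ignored), whereas unhandled_cp_access injects an undefined exception into the guest. A rough sketch of the RAZ/WI behaviour, simplified from the existing handler in coproc.c (illustrative rather than the exact upstream code):

static bool trap_raz_wi(struct kvm_vcpu *vcpu,
			const struct coproc_params *p,
			const struct coproc_reg *r)
{
	if (p->is_write)
		return ignore_write(vcpu, p);	/* write ignored */
	else
		return read_zero(vcpu, p);	/* read as zero */
}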

>
>-Christoffer
>
>>  	.if \operation == vmentry
>>  	orr	r2, r2, r3		@ Trap some perfmon accesses
>>  	.else
>> -- 
>> 1.7.12.4
>> 

-- 
zhichao.huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers
  2015-06-29 21:16     ` Christoffer Dall
@ 2015-07-01  7:14       ` zichao
  -1 siblings, 0 replies; 82+ messages in thread
From: zichao @ 2015-07-01  7:14 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel



On June 30, 2015 5:16:41 AM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:29PM +0800, Zhichao Huang wrote:
>> Add handlers for all the 32-bit debug registers.
>> 
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_asm.h  |  12 ++++
>>  arch/arm/include/asm/kvm_host.h |   3 +
>>  arch/arm/kernel/asm-offsets.c   |   1 +
>>  arch/arm/kvm/coproc.c           | 122
>++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 138 insertions(+)
>> 
>> diff --git a/arch/arm/include/asm/kvm_asm.h
>b/arch/arm/include/asm/kvm_asm.h
>> index 25410b2..ba65e05 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -52,6 +52,18 @@
>>  #define c10_AMAIR1	30	/* Auxilary Memory Attribute Indirection Reg1
>*/
>>  #define NR_CP15_REGS	31	/* Number of regs (incl. invalid) */
>>  
>> +/* 0 is reserved as an invalid value. */
>> +#define cp14_DBGBVR0	1	/* Debug Breakpoint Control Registers (0-15)
>*/
>> +#define cp14_DBGBVR15	16
>> +#define cp14_DBGBCR0	17	/* Debug Breakpoint Value Registers (0-15)
>*/
>> +#define cp14_DBGBCR15	32
>> +#define cp14_DBGWVR0	33	/* Debug Watchpoint Control Registers (0-15)
>*/
>> +#define cp14_DBGWVR15	48
>> +#define cp14_DBGWCR0	49	/* Debug Watchpoint Value Registers (0-15)
>*/
>> +#define cp14_DBGWCR15	64
>> +#define cp14_DBGDSCRext	65	/* Debug Status and Control external */
>> +#define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
>> +
>>  #define ARM_EXCEPTION_RESET	  0
>>  #define ARM_EXCEPTION_UNDEFINED   1
>>  #define ARM_EXCEPTION_SOFTWARE    2
>> diff --git a/arch/arm/include/asm/kvm_host.h
>b/arch/arm/include/asm/kvm_host.h
>> index d71607c..3d16820 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -124,6 +124,9 @@ struct kvm_vcpu_arch {
>>  	struct vgic_cpu vgic_cpu;
>>  	struct arch_timer_cpu timer_cpu;
>>  
>> +	/* System control coprocessor (cp14) */
>> +	u32 cp14[NR_CP14_REGS];
>> +
>>  	/*
>>  	 * Anything that is not used directly from assembly code goes
>>  	 * here.
>> diff --git a/arch/arm/kernel/asm-offsets.c
>b/arch/arm/kernel/asm-offsets.c
>> index 871b826..9158de0 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -172,6 +172,7 @@ int main(void)
>>  #ifdef CONFIG_KVM_ARM_HOST
>>    DEFINE(VCPU_KVM,		offsetof(struct kvm_vcpu, kvm));
>>    DEFINE(VCPU_MIDR,		offsetof(struct kvm_vcpu, arch.midr));
>> +  DEFINE(VCPU_CP14,		offsetof(struct kvm_vcpu, arch.cp14));
>>    DEFINE(VCPU_CP15,		offsetof(struct kvm_vcpu, arch.cp15));
>>    DEFINE(VCPU_VFP_GUEST,	offsetof(struct kvm_vcpu, arch.vfp_guest));
>>    DEFINE(VCPU_VFP_HOST,		offsetof(struct kvm_vcpu,
>arch.host_cpu_context));
>> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> index 16d5f69..59b65b7 100644
>> --- a/arch/arm/kvm/coproc.c
>> +++ b/arch/arm/kvm/coproc.c
>> @@ -220,6 +220,47 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
>>  	return true;
>>  }
>>  
>> +static bool trap_debug32(struct kvm_vcpu *vcpu,
>> +			const struct coproc_params *p,
>> +			const struct coproc_reg *r)
>> +{
>> +	if (p->is_write)
>> +		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
>> +	else
>> +		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
>> +
>> +	return true;
>> +}
>> +
>> +/* DBGIDR (RO) Debug ID */
>> +static bool trap_dbgidr(struct kvm_vcpu *vcpu,
>> +			const struct coproc_params *p,
>> +			const struct coproc_reg *r)
>> +{
>> +	u32 val;
>> +
>> +	if (p->is_write)
>> +		return ignore_write(vcpu, p);
>> +
>> +	ARM_DBG_READ(c0, c0, 0, val);
>> +	*vcpu_reg(vcpu, p->Rt1) = val;
>> +
>> +	return true;
>> +}
>> +
>> +/* DBGDSCRint (RO) Debug Status and Control Register */
>> +static bool trap_dbgdscr(struct kvm_vcpu *vcpu,
>> +			const struct coproc_params *p,
>> +			const struct coproc_reg *r)
>> +{
>> +	if (p->is_write)
>> +		return ignore_write(vcpu, p);
>> +
>> +	*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
>> +
>> +	return true;
>> +}
>> +
>>  /*
>>   * We could trap ID_DFR0 and tell the guest we don't support
>performance
>>   * monitoring.  Unfortunately the patch to make the kernel check
>ID_DFR0 was
>> @@ -375,7 +416,88 @@ static const struct coproc_reg cp15_regs[] = {
>>  	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
>>  };
>>  
>> +#define DBG_BCR_BVR_WCR_WVR(n)					\
>> +	/* DBGBVRn */						\
>> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 4), is32,		\
>> +	  trap_debug32,	reset_val, (cp14_DBGBVR0 + (n)), 0 },	\
>> +	/* DBGBCRn */						\
>> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 5), is32,		\
>> +	  trap_debug32,	reset_val, (cp14_DBGBCR0 + (n)), 0 },	\
>> +	/* DBGWVRn */						\
>> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 6), is32,		\
>> +	  trap_debug32,	reset_val, (cp14_DBGWVR0 + (n)), 0 },	\
>> +	/* DBGWCRn */						\
>> +	{ CRn( 0), CRm((n)), Op1( 0), Op2( 7), is32,		\
>> +	  trap_debug32,	reset_val, (cp14_DBGWCR0 + (n)), 0 }
>> +
>> +/* No OS DBGBXVR machanism implemented. */
>> +#define DBGBXVR(n)						\
>> +	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
>> +
>> +/*
>> + * Trapped cp14 registers. We generally ignore most of the external
>> + * debug, on the principle that they don't really make sense to a
>> + * guest. Revisit this one day, whould this principle change.
>
>s/whould/should/
>

ok.

>> + */
>>  static const struct coproc_reg cp14_regs[] = {
>> +	/* DBGIDR */
>> +	{ CRn( 0), CRm( 0), Op1( 0), Op2( 0), is32, trap_dbgidr},
>> +	/* DBGDTRRXext */
>> +	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
>> +	DBG_BCR_BVR_WCR_WVR(0),
>> +	/* DBGDSCRint */
>> +	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
>> +				NULL, cp14_DBGDSCRext },
>> +	DBG_BCR_BVR_WCR_WVR(1),
>> +	/* DBGDSCRext */
>> +	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
>> +				reset_val, cp14_DBGDSCRext, 0 },
>> +	DBG_BCR_BVR_WCR_WVR(2),
>> +	/* DBGDTRRXext */
>> +	{ CRn( 0), CRm( 3), Op1( 0), Op2( 2), is32, trap_raz_wi },
>
>isn't this the DBGDTRTXext register?

Thanks for pointing that out.
>
>> +	DBG_BCR_BVR_WCR_WVR(3),
>> +	DBG_BCR_BVR_WCR_WVR(4),
>> +	/* DBGDTR[RT]Xint */
>> +	{ CRn( 0), CRm( 5), Op1( 0), Op2( 0), is32, trap_raz_wi },
>> +	DBG_BCR_BVR_WCR_WVR(5),
>> +	DBG_BCR_BVR_WCR_WVR(6),
>> +	/* DBGVCR */
>> +	{ CRn( 0), CRm( 7), Op1( 0), Op2( 0), is32, trap_debug32 },
>> +	DBG_BCR_BVR_WCR_WVR(7),
>> +	DBG_BCR_BVR_WCR_WVR(8),
>> +	DBG_BCR_BVR_WCR_WVR(9),
>> +	DBG_BCR_BVR_WCR_WVR(10),
>> +	DBG_BCR_BVR_WCR_WVR(11),
>> +	DBG_BCR_BVR_WCR_WVR(12),
>> +	DBG_BCR_BVR_WCR_WVR(13),
>> +	DBG_BCR_BVR_WCR_WVR(14),
>> +	DBG_BCR_BVR_WCR_WVR(15),
>> +
>> +	DBGBXVR(0),
>> +	/* DBGOSLAR */
>> +	{ CRn( 1), CRm( 0), Op1( 0), Op2( 4), is32, trap_raz_wi },
>> +	DBGBXVR(1),
>> +	/* DBGOSLSR */
>> +	{ CRn( 1), CRm( 1), Op1( 0), Op2( 4), is32, trap_raz_wi },
>> +	DBGBXVR(2),
>> +	DBGBXVR(3),
>> +	/* DBGOSDLRd */
>> +	{ CRn( 1), CRm( 3), Op1( 0), Op2( 4), is32, trap_raz_wi },
>> +	DBGBXVR(4),
>> +	DBGBXVR(5),
>> +	/* DBGPRSRa */
>> +	{ CRn( 1), CRm( 5), Op1( 0), Op2( 4), is32, trap_raz_wi },
>> +
>> +	DBGBXVR(6),
>> +	DBGBXVR(7),
>> +	DBGBXVR(8),
>> +	DBGBXVR(9),
>> +	DBGBXVR(10),
>> +	DBGBXVR(11),
>> +	DBGBXVR(12),
>> +	DBGBXVR(13),
>> +	DBGBXVR(14),
>> +	DBGBXVR(15),
>>  };
>>  
>>  /* Target specific emulation tables */
>> -- 
>> 1.7.12.4
>> 
>Otherwise this looks ok,
>-Christoffer

-- 
zhichao.huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit debug registers
  2015-06-30 13:20     ` Christoffer Dall
@ 2015-07-01  7:43       ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-01  7:43 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao



On June 30, 2015 9:20:29 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:30PM +0800, Zhichao Huang wrote:
>> Add handlers for all the 64-bit debug registers.
>> 
>> There is an overlap between 32 and 64bit registers. Make sure that
>> 64-bit registers preceding 32-bit ones.
>> 
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/kvm/coproc.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>> 
>> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> index 59b65b7..eeee648 100644
>> --- a/arch/arm/kvm/coproc.c
>> +++ b/arch/arm/kvm/coproc.c
>> @@ -435,9 +435,17 @@ static const struct coproc_reg cp15_regs[] = {
>>  	{ CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi }
>>  
>>  /*
>> + * Architected CP14 registers.
>> + *
>
>belongs in other patch?

OK, I will move it to patch [06/11].
>
>>   * Trapped cp14 registers. We generally ignore most of the external
>>   * debug, on the principle that they don't really make sense to a
>>   * guest. Revisit this one day, whould this principle change.
>> + *
>> + * CRn denotes the primary register number, but is copied to the CRm in the
>> + * user space API for 64-bit register access in line with the terminology used
>> + * in the ARM ARM.
>> + * Important: Must be sorted ascending by CRn, CRM, Op1, Op2 and
>with 64-bit
>> + *            registers preceding 32-bit ones.
>>   */
>>  static const struct coproc_reg cp14_regs[] = {
>>  	/* DBGIDR */
>> @@ -445,10 +453,14 @@ static const struct coproc_reg cp14_regs[] = {
>>  	/* DBGDTRRXext */
>>  	{ CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi },
>>  	DBG_BCR_BVR_WCR_WVR(0),
>> +	/* DBGDRAR (64bit) */
>> +	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is64, trap_raz_wi},
>>  	/* DBGDSCRint */
>>  	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr,
>>  				NULL, cp14_DBGDSCRext },
>>  	DBG_BCR_BVR_WCR_WVR(1),
>> +	/* DBGDSAR (64bit) */
>> +	{ CRn( 0), CRm( 2), Op1( 0), Op2( 0), is64, trap_raz_wi},
>>  	/* DBGDSCRext */
>>  	{ CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32,
>>  				reset_val, cp14_DBGDSCRext, 0 },
>> -- 
>> 1.7.12.4
>> 
>Otherwise:
>Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 01/11] KVM: arm: plug guest debug exploit
  2015-07-01  7:04       ` zichao
  (?)
@ 2015-07-01  9:00         ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-07-01  9:00 UTC (permalink / raw)
  To: zichao
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao, stable

On Wed, Jul 01, 2015 at 03:04:00PM +0800, zichao wrote:
> 
> 
> On June 29, 2015 11:49:53 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >On Mon, Jun 22, 2015 at 06:41:24PM +0800, Zhichao Huang wrote:
> >> Hardware debugging in guests is not intercepted currently, it means
> >> that a malicious guest can bring down the entire machine by writing
> >> to the debug registers.
> >> 
> >> This patch enable trapping of all debug registers, preventing the
> >guests
> >> to access the debug registers.
> >> 
> >> This patch also disable the debug mode(DBGDSCR) in the guest world
> >all
> >> the time, preventing the guests to mess with the host state.
> >> 
> >> However, it is a precursor for later patches which will need to do
> >> more to world switch debug states while necessary.
> >> 
> >> Cc: <stable@vger.kernel.org>
> >> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> >> ---
> >>  arch/arm/include/asm/kvm_coproc.h |  3 +-
> >>  arch/arm/kvm/coproc.c             | 60
> >+++++++++++++++++++++++++++++++++++----
> >>  arch/arm/kvm/handle_exit.c        |  4 +--
> >>  arch/arm/kvm/interrupts_head.S    | 13 ++++++++-
> >>  4 files changed, 70 insertions(+), 10 deletions(-)
> >> 
> >> diff --git a/arch/arm/include/asm/kvm_coproc.h
> >b/arch/arm/include/asm/kvm_coproc.h
> >> index 4917c2f..e74ab0f 100644
> >> --- a/arch/arm/include/asm/kvm_coproc.h
> >> +++ b/arch/arm/include/asm/kvm_coproc.h
> >> @@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct
> >kvm_coproc_target_table *table);
> >>  int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run);
> >>  int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run
> >*run);
> >>  int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run
> >*run);
> >> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run
> >*run);
> >> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
> >> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
> >>  int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
> >>  int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
> >>  
> >> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> >> index f3d88dc..2e12760 100644
> >> --- a/arch/arm/kvm/coproc.c
> >> +++ b/arch/arm/kvm/coproc.c
> >> @@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu
> >*vcpu, struct kvm_run *run)
> >>  	return 1;
> >>  }
> >>  
> >> -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run
> >*run)
> >> -{
> >> -	kvm_inject_undefined(vcpu);
> >> -	return 1;
> >> -}
> >> -
> >>  static void reset_mpidr(struct kvm_vcpu *vcpu, const struct
> >coproc_reg *r)
> >>  {
> >>  	/*
> >> @@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu,
> >struct kvm_run *run)
> >>  	return emulate_cp15(vcpu, &params);
> >>  }
> >>  
> >> +/**
> >> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
> >access
> >> + * @vcpu: The VCPU pointer
> >> + * @run:  The kvm_run struct
> >> + */
> >> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +{
> >> +	struct coproc_params params;
> >> +
> >> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> >> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> >> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> >> +	params.is_64bit = true;
> >> +
> >> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
> >> +	params.Op2 = 0;
> >> +	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> >> +	params.CRm = 0;
> >
> >this is a complete duplicate of kvm_handle_cp15_64, can you share this
> >code somehow?
> >
> 
> This patch just wants to plug the exploit in the simplest way; I share the cp14/cp15 handlers in a later patch [PATCH v3 04/11].
> 
> Should I move patch [04/11] ahead of the current patch [01/11]?
> 

It would be good if the patch that we cc to stable, and which fixes the
issue, is self-contained.  If it's impossible to do that while sharing
the handlers (I don't see why, but I didn't write the code) then OK, but
otherwise I would say just add that bit of code into this patch.
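
For illustration, the duplication being pointed out is just the HSR field
decode; a shared helper could look roughly like the sketch below
(hypothetical name and placement; the actual consolidation happens in
patch 04/11, so treat this as a sketch only):

	/* Decode a 64-bit (mcrr/mrrc) coprocessor trap from the HSR.
	 * The field offsets are the same ones used by kvm_handle_cp15_64;
	 * the helper name is made up for illustration.
	 */
	static void decode_cp_64(struct kvm_vcpu *vcpu, struct coproc_params *p)
	{
		u32 hsr = kvm_vcpu_get_hsr(vcpu);

		p->CRn      = (hsr >> 1) & 0xf;
		p->Rt1      = (hsr >> 5) & 0xf;
		p->Rt2      = (hsr >> 10) & 0xf;
		p->Op1      = (hsr >> 16) & 0xf;
		p->Op2      = 0;
		p->CRm      = 0;
		p->is_write = ((hsr & 1) == 0);
		p->is_64bit = true;
	}

Both kvm_handle_cp14_64 and kvm_handle_cp15_64 could then fill their params
through one such function and differ only in the table they dispatch to.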

> >> +
> >> +	/* raz_wi */
> >> +	(void)pm_fake(vcpu, &params, NULL);
> >> +
> >> +	/* handled */
> >> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >> +	return 1;
> >> +}
> >> +
> >> +/**
> >> + * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14
> >access
> >> + * @vcpu: The VCPU pointer
> >> + * @run:  The kvm_run struct
> >> + */
> >> +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +{
> >> +	struct coproc_params params;
> >> +
> >> +	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> >> +	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> >> +	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> >> +	params.is_64bit = false;
> >> +
> >> +	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> >> +	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
> >> +	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
> >> +	params.Rt2 = 0;
> >
> >this is a complete duplicate of kvm_handle_cp15_32, can you share this
> >code somehow?
> >
> >> +
> >> +	/* raz_wi */
> >> +	(void)pm_fake(vcpu, &params, NULL);
> >> +
> >> +	/* handled */
> >> +	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >> +	return 1;
> >> +}
> >> +
> >> 
> >/******************************************************************************
> >>   * Userspace API
> >>  
> >*****************************************************************************/
> >> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> >> index 95f12b2..357ad1b 100644
> >> --- a/arch/arm/kvm/handle_exit.c
> >> +++ b/arch/arm/kvm/handle_exit.c
> >> @@ -104,9 +104,9 @@ static exit_handle_fn arm_exit_handlers[] = {
> >>  	[HSR_EC_WFI]		= kvm_handle_wfx,
> >>  	[HSR_EC_CP15_32]	= kvm_handle_cp15_32,
> >>  	[HSR_EC_CP15_64]	= kvm_handle_cp15_64,
> >> -	[HSR_EC_CP14_MR]	= kvm_handle_cp14_access,
> >> +	[HSR_EC_CP14_MR]	= kvm_handle_cp14_32,
> >>  	[HSR_EC_CP14_LS]	= kvm_handle_cp14_load_store,
> >> -	[HSR_EC_CP14_64]	= kvm_handle_cp14_access,
> >> +	[HSR_EC_CP14_64]	= kvm_handle_cp14_64,
> >>  	[HSR_EC_CP_0_13]	= kvm_handle_cp_0_13_access,
> >>  	[HSR_EC_CP10_ID]	= kvm_handle_cp10_id,
> >>  	[HSR_EC_SVC_HYP]	= handle_svc_hyp,
> >> diff --git a/arch/arm/kvm/interrupts_head.S
> >b/arch/arm/kvm/interrupts_head.S
> >> index 35e4a3a..f85c447 100644
> >> --- a/arch/arm/kvm/interrupts_head.S
> >> +++ b/arch/arm/kvm/interrupts_head.S
> >> @@ -97,6 +97,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>  	mrs	r8, LR_fiq
> >>  	mrs	r9, SPSR_fiq
> >>  	push	{r2-r9}
> >> +
> >> +	/* DBGDSCR reg */
> >> +	mrc	p14, 0, r2, c0, c1, 0
> >> +	push	{r2}
> >
> >this feels like it should belong in read_cp15_state and not the gp regs
> >portion ?
> >
> 
> Happy to move it. But moving the cp14 regs to read/write_cp15_state still seems not very appropriate. Should I move them to __kvm_vcpu_return and __kvm_vcpu_run?

you should probably rename read_cp15_state to read_coproc_state then.

> 
> Another reason might be that I want to disable debug mode (DBGDSCR) as early as possible.
> 

Why?  The world-switch code is atomic in that sense, is it not?

> >
> >>  .endm
> >>  
> >>  .macro pop_host_regs_mode mode
> >> @@ -111,6 +115,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>   * Clobbers all registers, in all modes, except r0 and r1.
> >>   */
> >>  .macro restore_host_regs
> >> +	pop	{r2}
> >> +	mcr	p14, 0, r2, c0, c2, 2
> >> +
> >
> >Why are we reading the DBGDSCRint and writing the DBGDSCRext ?
> 
> Because DBGDSCRint is read-only; I borrowed the operation from the kernel.
> 
> arch/arm/kernel/hw_breakpoint.c:
> ARM_DBG_READ(c0, c1, 0, dscr)
> ARM_DBG_WRITE(c0, c2, 2, dscr)
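
For reference, the asymmetry described here (read the DSCR through
DBGDSCRint, write it back through DBGDSCRext) matches the encodings used
in the assembly above; a C sketch, with made-up helper names, would be:

	static inline u32 read_dbgdscr(void)
	{
		u32 dscr;

		/* DBGDSCRint: p14, 0, c0, c1, 0 (read-only view of the DSCR) */
		asm volatile("mrc p14, 0, %0, c0, c1, 0" : "=r" (dscr));
		return dscr;
	}

	static inline void write_dbgdscr(u32 dscr)
	{
		/* DBGDSCRext: p14, 0, c0, c2, 2 (writable view of the DSCR) */
		asm volatile("mcr p14, 0, %0, c0, c2, 2" : : "r" (dscr));
	}
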
> >
> >>  	pop	{r2-r9}
> >>  	msr	r8_fiq, r2
> >>  	msr	r9_fiq, r3
> >> @@ -159,6 +166,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>   * Clobbers *all* registers.
> >>   */
> >>  .macro restore_guest_regs
> >> +	/* reset DBGDSCR to disable debug mode */
> >> +	mov	r2, #0
> >> +	mcr	p14, 0, r2, c0, c2, 2
> >
> >Is it valid to write 0 to all fields of this register?
> 
> I'm worried about that too, although it tests OK. Does Will have any suggestions?
> >
> >I thought Will expressed concern about accessing this register?  Why is
> >it safe in this context and not before?  It seems from the spec that
> >this can still raise an undefined exception if an external debugger
> >lowers the software debug enable signal.
> >
> >> +
> >>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
> >>  	restore_guest_regs_mode abt, #VCPU_ABT_REGS
> >>  	restore_guest_regs_mode und, #VCPU_UND_REGS
> >> @@ -607,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
> >>   * (hardware reset value is 0) */
> >>  .macro set_hdcr operation
> >>  	mrc	p15, 4, r2, c1, c1, 1
> >> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
> >> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
> >>  	.if \operation == vmentry
> >>  	orr	r2, r2, r3		@ Trap some perfmon accesses
> >>  	.else
> >> -- 
> >> 1.7.12.4
> >> 
> >
> >Thanks,
> >-Christoffer
> 
> -- 
> zhichao.huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
  2015-07-01  7:09       ` zichao
@ 2015-07-01  9:00         ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-07-01  9:00 UTC (permalink / raw)
  To: zichao
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao

On Wed, Jul 01, 2015 at 03:09:35PM +0800, zichao wrote:
> 
> 
> On June 30, 2015 3:43:34 AM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >On Mon, Jun 22, 2015 at 06:41:27PM +0800, Zhichao Huang wrote:
> >> As we're about to trap a bunch of CP14 registers, let's rework
> >> the CP15 handling so it can be generalized and work with multiple
> >> tables.
> >> 
> >> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> >> ---
> >>  arch/arm/kvm/coproc.c          | 176
> >++++++++++++++++++++++++++---------------
> >>  arch/arm/kvm/interrupts_head.S |   2 +-
> >>  2 files changed, 112 insertions(+), 66 deletions(-)
> >> 
> >> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> >> index 9d283d9..d23395b 100644
> >> --- a/arch/arm/kvm/coproc.c
> >> +++ b/arch/arm/kvm/coproc.c
> >> @@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = {
> >>  	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
> >>  };
> >>  
> >> +static const struct coproc_reg cp14_regs[] = {
> >> +};
> >> +
> >>  /* Target specific emulation tables */
> >>  static struct kvm_coproc_target_table
> >*target_tables[KVM_ARM_NUM_TARGETS];
> >>  
> >> @@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const
> >struct coproc_params *params,
> >>  	return NULL;
> >>  }
> >>  
> >> -static int emulate_cp15(struct kvm_vcpu *vcpu,
> >> -			const struct coproc_params *params)
> >> +/*
> >> + * emulate_cp --  tries to match a cp14/cp15 access in a handling
> >table,
> >> + *                and call the corresponding trap handler.
> >> + *
> >> + * @params: pointer to the descriptor of the access
> >> + * @table: array of trap descriptors
> >> + * @num: size of the trap descriptor array
> >> + *
> >> + * Return 0 if the access has been handled, and -1 if not.
> >> + */
> >> +static int emulate_cp(struct kvm_vcpu *vcpu,
> >> +			const struct coproc_params *params,
> >> +			const struct coproc_reg *table,
> >> +			size_t num)
> >>  {
> >> -	size_t num;
> >> -	const struct coproc_reg *table, *r;
> >> -
> >> -	trace_kvm_emulate_cp15_imp(params->Op1, params->Rt1, params->CRn,
> >> -				   params->CRm, params->Op2, params->is_write);
> >> +	const struct coproc_reg *r;
> >>  
> >> -	table = get_target_table(vcpu->arch.target, &num);
> >> +	if (!table)
> >> +		return -1;	/* Not handled */
> >>  
> >> -	/* Search target-specific then generic table. */
> >>  	r = find_reg(params, table, num);
> >> -	if (!r)
> >> -		r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs));
> >>  
> >> -	if (likely(r)) {
> >> +	if (r) {
> >>  		/* If we don't have an accessor, we should never get here! */
> >>  		BUG_ON(!r->access);
> >>  
> >>  		if (likely(r->access(vcpu, params, r))) {
> >>  			/* Skip instruction, since it was emulated */
> >>  			kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >> -			return 1;
> >>  		}
> >> -		/* If access function fails, it should complain. */
> >> -	} else {
> >> -		kvm_err("Unsupported guest CP15 access at: %08lx\n",
> >> -			*vcpu_pc(vcpu));
> >> -		print_cp_instr(params);
> >> +
> >> +		/* Handled */
> >> +		return 0;
> >>  	}
> >> +
> >> +	/* Not handled */
> >> +	return -1;
> >> +}
> >> +
> >> +static void unhandled_cp_access(struct kvm_vcpu *vcpu,
> >> +				const struct coproc_params *params)
> >> +{
> >> +	u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
> >> +	int cp;
> >> +
> >> +	switch (hsr_ec) {
> >> +	case HSR_EC_CP15_32:
> >> +	case HSR_EC_CP15_64:
> >> +		cp = 15;
> >> +		break;
> >> +	case HSR_EC_CP14_MR:
> >> +	case HSR_EC_CP14_64:
> >> +		cp = 14;
> >> +		break;
> >> +	default:
> >> +		WARN_ON((cp = -1));
> >> +	}
> >> +
> >> +	kvm_err("Unsupported guest CP%d access at: %08lx\n",
> >> +		cp, *vcpu_pc(vcpu));
> >> +	print_cp_instr(params);
> >>  	kvm_inject_undefined(vcpu);
> >> -	return 1;
> >>  }
> >>  
> >> -/**
> >> - * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15
> >access
> >> - * @vcpu: The VCPU pointer
> >> - * @run:  The kvm_run struct
> >> - */
> >> -int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +int kvm_handle_cp_64(struct kvm_vcpu *vcpu,
> >> +			const struct coproc_reg *global,
> >> +			size_t nr_global,
> >> +			const struct coproc_reg *target_specific,
> >> +			size_t nr_specific)
> >>  {
> >>  	struct coproc_params params;
> >>  
> >> @@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu,
> >struct kvm_run *run)
> >>  	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> >>  	params.CRm = 0;
> >>  
> >> -	return emulate_cp15(vcpu, &params);
> >> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
> >> +		return 1;
> >> +	if (!emulate_cp(vcpu, &params, global, nr_global))
> >> +		return 1;
> >> +
> >> +	unhandled_cp_access(vcpu, &params);
> >> +	return 1;
> >>  }
> >>  
> >>  static void reset_coproc_regs(struct kvm_vcpu *vcpu,
> >> @@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu
> >*vcpu,
> >>  			table[i].reset(vcpu, &table[i]);
> >>  }
> >>  
> >> -/**
> >> - * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15
> >access
> >> - * @vcpu: The VCPU pointer
> >> - * @run:  The kvm_run struct
> >> - */
> >> -int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
> >> +			const struct coproc_reg *global,
> >> +			size_t nr_global,
> >> +			const struct coproc_reg *target_specific,
> >> +			size_t nr_specific)
> >>  {
> >>  	struct coproc_params params;
> >>  
> >> @@ -510,33 +546,57 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu,
> >struct kvm_run *run)
> >>  	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
> >>  	params.Rt2 = 0;
> >>  
> >> -	return emulate_cp15(vcpu, &params);
> >> +	if (!emulate_cp(vcpu, &params, target_specific, nr_specific))
> >> +		return 1;
> >> +	if (!emulate_cp(vcpu, &params, global, nr_global))
> >> +		return 1;
> >> +
> >> +	unhandled_cp_access(vcpu, &params);
> >> +	return 1;
> >>  }
> >>  
> >>  /**
> >> - * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
> >access
> >> + * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15
> >access
> >>   * @vcpu: The VCPU pointer
> >>   * @run:  The kvm_run struct
> >>   */
> >> -int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>  {
> >> -	struct coproc_params params;
> >> +	const struct coproc_reg *target_specific;
> >> +	size_t num;
> >>  
> >> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> >> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> >> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> >> -	params.is_64bit = true;
> >> +	target_specific = get_target_table(vcpu->arch.target, &num);
> >> +	return kvm_handle_cp_64(vcpu,
> >> +				cp15_regs, ARRAY_SIZE(cp15_regs),
> >> +				target_specific, num);
> >> +}
> >>  
> >> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 16) & 0xf;
> >> -	params.Op2 = 0;
> >> -	params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> >> -	params.CRm = 0;
> >> +/**
> >> + * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15
> >access
> >> + * @vcpu: The VCPU pointer
> >> + * @run:  The kvm_run struct
> >> + */
> >> +int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +{
> >> +	const struct coproc_reg *target_specific;
> >> +	size_t num;
> >>  
> >> -	(void)trap_raz_wi(vcpu, &params, NULL);
> >> +	target_specific = get_target_table(vcpu->arch.target, &num);
> >> +	return kvm_handle_cp_32(vcpu,
> >> +				cp15_regs, ARRAY_SIZE(cp15_regs),
> >> +				target_specific, num);
> >> +}
> >>  
> >> -	/* handled */
> >> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >> -	return 1;
> >> +/**
> >> + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14
> >access
> >> + * @vcpu: The VCPU pointer
> >> + * @run:  The kvm_run struct
> >> + */
> >> +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >> +{
> >> +	return kvm_handle_cp_64(vcpu,
> >> +				cp14_regs, ARRAY_SIZE(cp14_regs),
> >> +				NULL, 0);
> >>  }
> >>  
> >>  /**
> >> @@ -546,23 +606,9 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu,
> >struct kvm_run *run)
> >>   */
> >>  int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>  {
> >> -	struct coproc_params params;
> >> -
> >> -	params.CRm = (kvm_vcpu_get_hsr(vcpu) >> 1) & 0xf;
> >> -	params.Rt1 = (kvm_vcpu_get_hsr(vcpu) >> 5) & 0xf;
> >> -	params.is_write = ((kvm_vcpu_get_hsr(vcpu) & 1) == 0);
> >> -	params.is_64bit = false;
> >> -
> >> -	params.CRn = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;
> >> -	params.Op1 = (kvm_vcpu_get_hsr(vcpu) >> 14) & 0x7;
> >> -	params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;
> >> -	params.Rt2 = 0;
> >> -
> >> -	(void)trap_raz_wi(vcpu, &params, NULL);
> >> -
> >> -	/* handled */
> >> -	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >> -	return 1;
> >> +	return kvm_handle_cp_32(vcpu,
> >> +				cp14_regs, ARRAY_SIZE(cp14_regs),
> >> +				NULL, 0);
> >>  }
> >>  
> >> 
> >/******************************************************************************
> >> diff --git a/arch/arm/kvm/interrupts_head.S
> >b/arch/arm/kvm/interrupts_head.S
> >> index f85c447..a20b9ad 100644
> >> --- a/arch/arm/kvm/interrupts_head.S
> >> +++ b/arch/arm/kvm/interrupts_head.S
> >> @@ -618,7 +618,7 @@ ARM_BE8(rev	r6, r6  )
> >>   * (hardware reset value is 0) */
> >>  .macro set_hdcr operation
> >>  	mrc	p15, 4, r2, c1, c1, 1
> >> -	ldr	r3, =(HDCR_TPM|HDCR_TPMCR|HDCR_TDRA|HDCR_TDOSA|HDCR_TDA)
> >> +	ldr	r3, =(HDCR_TPM|HDCR_TPMCR)
> >
> >why do we stop trapping accesses here?
> 
> Because we haven't finished our trap handlers yet. If we kept trapping enabled here, the VM would not run normally, since we use unhandled_cp_access in the trap handlers instead of trap_raz_wi.
> 
> I only enable trapping once everything is in place, in the last patch [11/11].
> 
ok, I see.  Feels a bit quirky, but ok.

-Christoffer
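
For illustration, once the later patches populate cp14_regs, an entry uses
the same descriptor layout as the cp15 table reworked above. A sketch of
one entry follows; the encoding is meant to be DBGBVR0 and trap_debug32
only appears in a later patch, so take this as an assumption rather than
the final table:

	static const struct coproc_reg cp14_regs[] = {
		/* DBGBVR0: p14, 0, c0, c0, 4. Real entries also wire up a
		 * reset handler and an index into vcpu->arch.cp14[].
		 */
		{ CRn( 0), CRm( 0), Op1( 0), Op2( 4), is32, trap_debug32 },
	};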

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-06-30  9:20     ` Christoffer Dall
@ 2015-07-03  9:54       ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-03  9:54 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel



On June 30, 2015 5:20:20 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:31PM +0800, Zhichao Huang wrote:
>> The trapping code keeps track of the state of the debug registers,
>> allowing for the switch code to implement a lazy switching strategy.
>> 
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_asm.h  |  3 +++
>>  arch/arm/include/asm/kvm_host.h |  3 +++
>>  arch/arm/kernel/asm-offsets.c   |  1 +
>>  arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
>>  arch/arm/kvm/interrupts_head.S  | 42
>+++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 86 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index ba65e05..4fb64cf 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -64,6 +64,9 @@
>>  #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
>>  #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
>>  
>> +#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
>> +#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
>> +
>>  #define ARM_EXCEPTION_RESET	  0
>>  #define ARM_EXCEPTION_UNDEFINED   1
>>  #define ARM_EXCEPTION_SOFTWARE    2
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 3d16820..09b54bf 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
>>  	/* System control coprocessor (cp14) */
>>  	u32 cp14[NR_CP14_REGS];
>>  
>> +	/* Debug state */
>> +	u32 debug_flags;
>> +
>>  	/*
>>  	 * Anything that is not used directly from assembly code goes
>>  	 * here.
>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>> index 9158de0..e876109 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -185,6 +185,7 @@ int main(void)
>>    DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu,
>arch.regs.fiq_regs));
>>    DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu,
>arch.regs.usr_regs.ARM_pc));
>>    DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu,
>arch.regs.usr_regs.ARM_cpsr));
>> +  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu,
>arch.debug_flags));
>>    DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
>>    DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
>>    DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
>> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> index eeee648..fc0c2ef 100644
>> --- a/arch/arm/kvm/coproc.c
>> +++ b/arch/arm/kvm/coproc.c
>> @@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
>>  	return true;
>>  }
>>  
>> +/*
>> + * We want to avoid world-switching all the DBG registers all the
>> + * time:
>> + *
>> + * - If we've touched any debug register, it is likely that we're
>> + *   going to touch more of them. It then makes sense to disable the
>> + *   traps and start doing the save/restore dance
>> + * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
>> + *   to save/restore the registers, as the guest depends on them.
>> + *
>> + * For this, we use a DIRTY bit, indicating the guest has modified
>the
>> + * debug registers, used as follow:
>> + *
>> + * On guest entry:
>> + * - If the dirty bit is set (because we're coming back from
>trapping),
>> + *   disable the traps, save host registers, restore guest
>registers.
>> + * - If debug is actively in use (ARM_DSCR_MDBGEN set),
>> + *   set the dirty bit, disable the traps, save host registers,
>> + *   restore guest registers.
>> + * - Otherwise, enable the traps
>> + *
>> + * On guest exit:
>> + * - If the dirty bit is set, save guest registers, restore host
>> + *   registers and clear the dirty bit. This ensure that the host can
>> + *   now use the debug registers.
>> + *
>> + * Notice:
>> + * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
>> + *   debug is always actively in use (ARM_DSCR_MDBGEN set).
>> + *   We have to do the save/restore dance in this case, because the
>> + *   host and the guest might use their respective debug registers
>> + *   at any moment.
>
>so doesn't this pretty much invalidate the whole saving/dirty effort?
>
>Guests configured from, for example, multi_v7_defconfig will then act
>like this and you will save/restore all these registers always.
>
>Wouldn't a better approach be to enable trapping to hyp mode most of
>the time, and simply clear the enabled bit of any host-used break- or
>watchpoints upon guest entry, perhaps maintaining a bitmap of which
>ones must be re-set when exiting the guest, thereby drastically
>reducing the amount of save/restore code you'd have to perform.
>
>Of course, you'd also have to keep track of whether the guest has any
>breakpoints or watchpoints enabled for when you do the full
>save/restore dance.
>
>That would also avoid all issues surrounding accesses to DBGDSCRext
>register I think.
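
A rough sketch of that suggestion, in C for readability; the accessor
names and the global masks are hypothetical and ignore per-CPU state, so
this only shows the shape of the idea, not existing KVM code:

	/* On guest entry: leave CP14 trapping enabled, but disarm any
	 * breakpoints/watchpoints the host currently has enabled, and
	 * remember which ones to re-arm on guest exit.
	 */
	static u32 host_bcr_mask, host_wcr_mask;

	static void disarm_host_debug_regs(int nr_brp, int nr_wrp)
	{
		int i;

		host_bcr_mask = host_wcr_mask = 0;
		for (i = 0; i < nr_brp; i++) {
			u32 bcr = read_dbgbcr(i);	/* hypothetical accessor */

			if (bcr & 0x1) {		/* bit 0: breakpoint enable */
				host_bcr_mask |= 1 << i;
				write_dbgbcr(i, bcr & ~0x1);
			}
		}
		for (i = 0; i < nr_wrp; i++) {
			u32 wcr = read_dbgwcr(i);	/* hypothetical accessor */

			if (wcr & 0x1) {		/* bit 0: watchpoint enable */
				host_wcr_mask |= 1 << i;
				write_dbgwcr(i, wcr & ~0x1);
			}
		}
	}

	/* On guest exit, the bits set in host_bcr_mask/host_wcr_mask say
	 * which DBGBCRn/DBGWCRn enable bits need to be set again.
	 */
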

I have thought about it; that is to say, "since we can't tell from the
DBGDSCR whether the guest has any hardware breakpoints enabled, why
don't we find out from the DBGBCR and DBGWCR?".

Case 1: The host and the guest enable all the hardware breakpoints.
It's necessary to world-switch the debug registers all the time.

Case 2: The host and the guest enable some of the hardware breakpoints.
It's necessary to world-switch the debug registers that are enabled.
But if we want to skip the registers that aren't enabled, we have to
keep track of all the debug state in both the host and the guest.
We need to decide which debug registers to switch and which not, which
may add complex logic to the assembly code. And if the host or guest
enables almost all of the hardware breakpoints, the loss may outweigh
the gain.
Is it acceptable and worthwhile? If yes, I will do it.

Case 3: Neither the host nor the guest enables any hardware breakpoints.
This is the case where we can skip the whole world switch.
The only problem is that we have to read all the debug registers on each
guest entry to find out whether the host has any hardware breakpoints
enabled. But this is the common case, so it is worth the effort to
reduce the save/restore overhead.
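
For reference, the entry-time policy described in the comment block quoted
above boils down to roughly the following. This is only a C rendition of
what the compute_debug_state/skip_debug_state macros do; the function name
is hypothetical, all other identifiers come from the patch:

	static void kvm_arm_setup_debug_state(struct kvm_vcpu *vcpu)
	{
		/* If the guest has monitor debug enabled, force the dirty bit */
		if (vcpu->arch.cp14[cp14_DBGDSCRext] & ARM_DSCR_MDBGEN)
			vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;

		if (vcpu->arch.debug_flags & KVM_ARM_DEBUG_DIRTY) {
			/* disable CP14 traps, save host debug regs,
			 * restore the guest's */
		} else {
			/* keep traps enabled and skip the save/restore */
		}
	}

On guest exit the same flag is checked: if it is set, the guest registers
are saved, the host ones restored and the flag cleared.
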

>
>> + */
>>  static bool trap_debug32(struct kvm_vcpu *vcpu,
>>  			const struct coproc_params *p,
>>  			const struct coproc_reg *r)
>>  {
>> -	if (p->is_write)
>> +	if (p->is_write) {
>>  		vcpu->arch.cp14[r->reg] = *vcpu_reg(vcpu, p->Rt1);
>> -	else
>> +		vcpu->arch.debug_flags |= KVM_ARM_DEBUG_DIRTY;
>> +	} else {
>>  		*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.cp14[r->reg];
>> +	}
>>  
>>  	return true;
>>  }
>> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
>> index a20b9ad..5662c39 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -1,4 +1,6 @@
>>  #include <linux/irqchip/arm-gic.h>
>> +#include <asm/hw_breakpoint.h>
>> +#include <asm/kvm_asm.h>
>>  #include <asm/assembler.h>
>>  
>>  #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
>> @@ -407,6 +409,46 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>  	mcr	p15, 2, r12, c0, c0, 0	@ CSSELR
>>  .endm
>>  
>> +/* Assume vcpu pointer in vcpu reg, clobbers r5 */
>> +.macro skip_debug_state target
>> +	ldr	r5, [vcpu, #VCPU_DEBUG_FLAGS]
>> +	cmp	r5, #KVM_ARM_DEBUG_DIRTY
>> +	bne	\target
>> +1:
>> +.endm
>> +
>> +/* Compute debug state: If ARM_DSCR_MDBGEN or KVM_ARM_DEBUG_DIRTY
>> + * is set, we do a full save/restore cycle and disable trapping.
>> + *
>> + * Assumes vcpu pointer in vcpu reg
>> + *
>> + * Clobbers r5, r6
>> + */
>> +.macro compute_debug_state target
>> +	// Check the state of MDSCR_EL1
>> +	ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
>> +	and	r6, r5, #ARM_DSCR_MDBGEN
>> +	cmp	r6, #0
>
>you can just do 'ands' here, or even tst and you don't have to touch
>r6.

OK. Thanks for pointing that out.
>
>> +	beq	9998f	   // Nothing to see there
>> +
>> +	// If ARM_DSCR_MDBGEN bit was set, we must set the flag
>> +	mov	r5, #KVM_ARM_DEBUG_DIRTY
>> +	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
>> +	b	9999f	   // Don't skip restore
>> +
>> +9998:
>> +	// Otherwise load the flags from memory in case we recently
>> +	// trapped
>> +	skip_debug_state \target
>> +9999:
>> +.endm
>> +
>> +/* Assume vcpu pointer in vcpu reg, clobbers r5 */
>> +.macro clear_debug_dirty_bit
>> +	mov	r5, #0
>> +	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
>> +.endm
>> +
>>  /*
>>   * Save the VGIC CPU state into memory
>>   *
>> -- 
>> 1.7.12.4
>> 
>Thanks,
>-Christoffer

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm: implement lazy world switch for debug registers
  2015-06-30 13:15     ` Christoffer Dall
@ 2015-07-03 10:06       ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-03 10:06 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	will.deacon, huangzhichao



On June 30, 2015 9:15:22 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Mon, Jun 22, 2015 at 06:41:32PM +0800, Zhichao Huang wrote:
>> Implement switching of the debug registers. While the number
>> of registers is massive, CPUs usually don't implement them all
>> (A15 has 6 breakpoints and 4 watchpoints, which gives us a total
>> of 22 registers "only").
>> 
>> Notice that, for ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in
>> the guest, debug is always actively in use (ARM_DSCR_MDBGEN set).
>> 
>> We have to do the save/restore dance in this case, because the host
>> and the guest might use their respective debug registers at any moment.
>
>this sounds expensive, and I suggested an alternative approach in the
>previous patch.  In any case, measuring the impact of this on hardware
>would be a great idea...
>
>> 
>> If the CONFIG_HAVE_HW_BREAKPOINT is not set, and if no one flagged
>> the debug registers as dirty, we only save/resotre DBGDSCR.
>
>restore
>
>> 
>> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> ---
>>  arch/arm/kvm/interrupts.S      |  16 +++
>>  arch/arm/kvm/interrupts_head.S | 249 ++++++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 263 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index 79caf79..d626275 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -116,6 +116,12 @@ ENTRY(__kvm_vcpu_run)
>>  	read_cp15_state store_to_vcpu = 0
>>  	write_cp15_state read_from_vcpu = 1
>>  
>> +	@ Store hardware CP14 state and load guest state
>> +	compute_debug_state 1f
>> +	bl __save_host_debug_regs
>> +	bl __restore_guest_debug_regs
>> +
>> +1:
>>  	@ If the host kernel has not been configured with VFPv3 support,
>>  	@ then it is safer if we deny guests from using it as well.
>>  #ifdef CONFIG_VFPv3
>> @@ -201,6 +207,16 @@ after_vfp_restore:
>>  	mrc	p15, 0, r2, c0, c0, 5
>>  	mcr	p15, 4, r2, c0, c0, 5
>>  
>> +	@ Store guest CP14 state and restore host state
>> +	skip_debug_state 1f
>> +	bl __save_guest_debug_regs
>> +	bl __restore_host_debug_regs
>> +	/* Clear the dirty flag for the next run, as all the state has
>> +	 * already been saved. Note that we nuke the whole 32bit word.
>> +	 * If we ever add more flags, we'll have to be more careful...
>> +	 */
>> +	clear_debug_dirty_bit
>> +1:
>>  	@ Store guest CP15 state and restore host state
>>  	read_cp15_state store_to_vcpu = 1
>>  	write_cp15_state read_from_vcpu = 0
>> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
>> index 5662c39..ed406be 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -7,6 +7,7 @@
>>  #define VCPU_USR_SP		(VCPU_USR_REG(13))
>>  #define VCPU_USR_LR		(VCPU_USR_REG(14))
>>  #define CP15_OFFSET(_cp15_reg_idx) (VCPU_CP15 + (_cp15_reg_idx * 4))
>> +#define CP14_OFFSET(_cp14_reg_idx) (VCPU_CP14 + ((_cp14_reg_idx) *
>4))
>>  
>>  /*
>>   * Many of these macros need to access the VCPU structure, which is
>always
>> @@ -168,8 +169,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>   * Clobbers *all* registers.
>>   */
>>  .macro restore_guest_regs
>> -	/* reset DBGDSCR to disable debug mode */
>> -	mov	r2, #0
>> +	ldr	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
>>  	mcr	p14, 0, r2, c0, c2, 2
>>  
>>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
>> @@ -250,6 +250,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>  	save_guest_regs_mode abt, #VCPU_ABT_REGS
>>  	save_guest_regs_mode und, #VCPU_UND_REGS
>>  	save_guest_regs_mode irq, #VCPU_IRQ_REGS
>> +
>> +	/* DBGDSCR reg */
>> +	mrc	p14, 0, r2, c0, c1, 0
>> +	str	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
>>  .endm
>>  
>>  /* Reads cp15 registers from hardware and stores them in memory
>> @@ -449,6 +453,231 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>>  	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
>>  .endm
>>  
>> +/* Assume r11/r12 in used, clobbers r2-r10 */
>> +.macro cp14_read_and_push Op2 skip_num
>> +	cmp	\skip_num, #8
>> +	// if (skip_num >= 8) then skip c8-c15 directly
>> +	bge	1f
>> +	adr	r2, 9998f
>> +	add	r2, r2, \skip_num, lsl #2
>> +	bx	r2
>> +1:
>> +	adr	r2, 9999f
>> +	sub	r3, \skip_num, #8
>> +	add	r2, r2, r3, lsl #2
>> +	bx	r2
>> +9998:
>> +	mrc	p14, 0, r10, c0, c15, \Op2
>> +	mrc	p14, 0, r9, c0, c14, \Op2
>> +	mrc	p14, 0, r8, c0, c13, \Op2
>> +	mrc	p14, 0, r7, c0, c12, \Op2
>> +	mrc	p14, 0, r6, c0, c11, \Op2
>> +	mrc	p14, 0, r5, c0, c10, \Op2
>> +	mrc	p14, 0, r4, c0, c9, \Op2
>> +	mrc	p14, 0, r3, c0, c8, \Op2
>> +	push	{r3-r10}
>
>you probably don't want to do more stores to memory than required

Yeah, there is no need to push some of the registers, but I can't find a better
way to optimize it; are there any precedents that I can refer to?

Imagine that there are only 2 hwbrpts available, BCR0/BCR1, and we
can code like this:

Save:
    jump to 1
    save BCR2 to r5
1:
    save BCR1 to r4
    save BCR0 to r3
    push {r3-r5}

Restore:
    pop {r3-r5}
    jump to 1
    restore r5 to BCR2
1:
    restore r4 to BCR1
    restore r3 to BCR0

But if we want to push only the registers we actually need, we have to
code like this:

Save:
    jump to 1
    save BCR2 to r5
    push r5
1:
    save BCR1 to r4
    push r4
    save BCR0 to r3
    push r3

Restore:
    jump to 1
    pop r5
    restore r5 to BCR2
1:
    pop r4
    restore r4 to BCR1
    pop r3
    restore r3 to BCR0

Then we might encounter a mistake on restoring, as we want to pop
r4 but actually pop r3.
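
One untested idea to get around this: drop the block push/pop for the host
as well and give every register its own fixed slot, the way
cp14_read_and_str already does for the guest. Then skipping unimplemented
registers never shifts anyone else's position. A rough sketch for the
2-hwbrpt example above (HOST_DBGBCR is a made-up offset for a host save
area, and r11 holds how many of the upper slots to skip -- here 1, since
only BCR0/BCR1 exist out of the three shown):

Save:
	adr	r2, 1f
	add	r2, r2, r11, lsl #3	@ 8 bytes (mrc + str) per register
	bx	r2
1:	mrc	p14, 0, r3, c0, c2, 5	@ DBGBCR2 (skipped here)
	str	r3, [vcpu, #(HOST_DBGBCR + 8)]
	mrc	p14, 0, r3, c0, c1, 5	@ DBGBCR1
	str	r3, [vcpu, #(HOST_DBGBCR + 4)]
	mrc	p14, 0, r3, c0, c0, 5	@ DBGBCR0
	str	r3, [vcpu, #(HOST_DBGBCR + 0)]

Restore:
	adr	r2, 1f
	add	r2, r2, r11, lsl #3
	bx	r2
1:	ldr	r3, [vcpu, #(HOST_DBGBCR + 8)]
	mcr	p14, 0, r3, c0, c2, 5	@ DBGBCR2 (skipped here)
	ldr	r3, [vcpu, #(HOST_DBGBCR + 4)]
	mcr	p14, 0, r3, c0, c1, 5	@ DBGBCR1
	ldr	r3, [vcpu, #(HOST_DBGBCR + 0)]
	mcr	p14, 0, r3, c0, c0, 5	@ DBGBCR0

Because the slot offset is tied to the register number rather than to how
many values happen to be on the stack, the save and restore sides can never
get out of step.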

>
>> +9999:
>> +	mrc	p14, 0, r10, c0, c7, \Op2
>> +	mrc	p14, 0, r9, c0, c6, \Op2
>> +	mrc	p14, 0, r8, c0, c5, \Op2
>> +	mrc	p14, 0, r7, c0, c4, \Op2
>> +	mrc	p14, 0, r6, c0, c3, \Op2
>> +	mrc	p14, 0, r5, c0, c2, \Op2
>> +	mrc	p14, 0, r4, c0, c1, \Op2
>> +	mrc	p14, 0, r3, c0, c0, \Op2
>> +	push	{r3-r10}
>
>same
>
>> +.endm
>> +
>> +/* Assume r11/r12 in used, clobbers r2-r10 */
>> +.macro cp14_pop_and_write Op2 skip_num
>> +	cmp	\skip_num, #8
>> +	// if (skip_num >= 8) then skip c8-c15 directly
>> +	bge	1f
>> +	adr	r2, 9998f
>> +	add	r2, r2, \skip_num, lsl #2
>> +	pop	{r3-r10}
>
>you probably don't want to do more loads from memory than required
>
>> +	bx	r2
>> +1:
>> +	adr	r2, 9999f
>> +	sub	r3, \skip_num, #8
>> +	add	r2, r2, r3, lsl #2
>> +	pop	{r3-r10}
>
>same
>
>> +	bx	r2
>> +
>> +9998:
>> +	mcr	p14, 0, r10, c0, c15, \Op2
>> +	mcr	p14, 0, r9, c0, c14, \Op2
>> +	mcr	p14, 0, r8, c0, c13, \Op2
>> +	mcr	p14, 0, r7, c0, c12, \Op2
>> +	mcr	p14, 0, r6, c0, c11, \Op2
>> +	mcr	p14, 0, r5, c0, c10, \Op2
>> +	mcr	p14, 0, r4, c0, c9, \Op2
>> +	mcr	p14, 0, r3, c0, c8, \Op2
>> +
>> +	pop	{r3-r10}
>> +9999:
>> +	mcr	p14, 0, r10, c0, c7, \Op2
>> +	mcr	p14, 0, r9, c0, c6, \Op2
>> +	mcr	p14, 0, r8, c0, c5, \Op2
>> +	mcr	p14, 0, r7, c0, c4, \Op2
>> +	mcr	p14, 0, r6, c0, c3, \Op2
>> +	mcr	p14, 0, r5, c0, c2, \Op2
>> +	mcr	p14, 0, r4, c0, c1, \Op2
>> +	mcr	p14, 0, r3, c0, c0, \Op2
>> +.endm
>> +
>> +/* Assume r11/r12 in used, clobbers r2-r3 */
>> +.macro cp14_read_and_str Op2 cp14_reg0 skip_num
>> +	adr	r3, 1f
>> +	add	r3, r3, \skip_num, lsl #3
>> +	bx	r3
>> +1:
>> +	mrc	p14, 0, r2, c0, c15, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
>> +	mrc	p14, 0, r2, c0, c14, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
>> +	mrc	p14, 0, r2, c0, c13, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
>> +	mrc	p14, 0, r2, c0, c12, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
>> +	mrc	p14, 0, r2, c0, c11, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
>> +	mrc	p14, 0, r2, c0, c10, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
>> +	mrc	p14, 0, r2, c0, c9, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
>> +	mrc	p14, 0, r2, c0, c8, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
>> +	mrc	p14, 0, r2, c0, c7, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
>> +	mrc	p14, 0, r2, c0, c6, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
>> +	mrc	p14, 0, r2, c0, c5, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
>> +	mrc	p14, 0, r2, c0, c4, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
>> +	mrc	p14, 0, r2, c0, c3, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
>> +	mrc	p14, 0, r2, c0, c2, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
>> +	mrc	p14, 0, r2, c0, c1, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
>> +	mrc	p14, 0, r2, c0, c0, \Op2
>> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
>> +.endm
>> +
>> +/* Assume r11/r12 in used, clobbers r2-r3 */
>> +.macro cp14_ldr_and_write Op2 cp14_reg0 skip_num
>> +	adr	r3, 1f
>> +	add	r3, r3, \skip_num, lsl #3
>> +	bx	r3
>> +1:
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
>> +	mcr	p14, 0, r2, c0, c15, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
>> +	mcr	p14, 0, r2, c0, c14, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
>> +	mcr	p14, 0, r2, c0, c13, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
>> +	mcr	p14, 0, r2, c0, c12, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
>> +	mcr	p14, 0, r2, c0, c11, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
>> +	mcr	p14, 0, r2, c0, c10, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
>> +	mcr	p14, 0, r2, c0, c9, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
>> +	mcr	p14, 0, r2, c0, c8, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
>> +	mcr	p14, 0, r2, c0, c7, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
>> +	mcr	p14, 0, r2, c0, c6, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
>> +	mcr	p14, 0, r2, c0, c5, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
>> +	mcr	p14, 0, r2, c0, c4, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
>> +	mcr	p14, 0, r2, c0, c3, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
>> +	mcr	p14, 0, r2, c0, c2, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
>> +	mcr	p14, 0, r2, c0, c1, \Op2
>> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
>> +	mcr	p14, 0, r2, c0, c0, \Op2
>> +.endm
>
>can you not find some way of unifying cp14_pop_and_write with
>cp14_ldr_and_write and cp14_read_and_push with cp14_read_and_str ?
>
>Probably having two separate structs for the VFP state on the vcpu
>struct
>for both the guest and the host state is one possible way of doing so.
>

OK, I will do it.
Would you like me to rename the struct vfp_hard_struct, and add 
host_cp14_state in there, or add a new struct host_cp14_state in the
kvm_vcpu_arch?

>> +
>> +/* Get extract number of BRPs and WRPs. Saved in r11/r12 */
>> +.macro read_hw_dbg_num
>> +	mrc	p14, 0, r2, c0, c0, 0
>> +	ubfx	r11, r2, #24, #4
>> +	add	r11, r11, #1		// Extract BRPs
>> +	ubfx	r12, r2, #28, #4
>> +	add	r12, r12, #1		// Extract WRPs
>> +	mov	r2, #16
>> +	sub	r11, r2, r11		// How many BPs to skip
>> +	sub	r12, r2, r12		// How many WPs to skip
>> +.endm
>> +
>> +/* Reads cp14 registers from hardware.
>
>You have a lot of multi-line comments in these patches which don't
>start
>with a separate '/*' line, as dictated by the Linux kernel coding
>style.
>So far, I've ignored this, but please fix all these throughout the
>series when you respin.
>
>> + * Writes cp14 registers in-order to the stack.
>> + *
>> + * Assumes vcpu pointer in vcpu reg
>> + *
>> + * Clobbers r2-r12
>> + */
>> +.macro save_host_debug_regs
>> +	read_hw_dbg_num
>> +	cp14_read_and_push #4, r11	@ DBGBVR
>> +	cp14_read_and_push #5, r11	@ DBGBCR
>> +	cp14_read_and_push #6, r12	@ DBGWVR
>> +	cp14_read_and_push #7, r12	@ DBGWCR
>> +.endm
>> +
>> +/* Reads cp14 registers from hardware.
>> + * Writes cp14 registers in-order to the VCPU struct pointed to by
>vcpup.
>> + *
>> + * Assumes vcpu pointer in vcpu reg
>> + *
>> + * Clobbers r2-r12
>> + */
>> +.macro save_guest_debug_regs
>> +	read_hw_dbg_num
>> +	cp14_read_and_str #4, cp14_DBGBVR0, r11
>
>why do you need the hash before the op2 field?

Sorry, I can't quite understand.

>
>> +	cp14_read_and_str #5, cp14_DBGBCR0, r11
>> +	cp14_read_and_str #6, cp14_DBGWVR0, r12
>> +	cp14_read_and_str #7, cp14_DBGWCR0, r12
>> +.endm
>> +
>> +/* Reads cp14 registers in-order from the stack.
>> + * Writes cp14 registers to hardware.
>> + *
>> + * Assumes vcpu pointer in vcpu reg
>> + *
>> + * Clobbers r2-r12
>> + */
>> +.macro restore_host_debug_regs
>> +	read_hw_dbg_num
>> +	cp14_pop_and_write #4, r11	@ DBGBVR
>> +	cp14_pop_and_write #5, r11	@ DBGBCR
>> +	cp14_pop_and_write #6, r12	@ DBGWVR
>> +	cp14_pop_and_write #7, r12	@ DBGWCR
>> +.endm
>> +
>> +/* Reads cp14 registers in-order from the VCPU struct pointed to by
>vcpup
>> + * Writes cp14 registers to hardware.
>> + *
>> + * Assumes vcpu pointer in vcpu reg
>> + *
>> + * Clobbers r2-r12
>> + */
>> +.macro restore_guest_debug_regs
>> +	read_hw_dbg_num
>> +	cp14_ldr_and_write #4, cp14_DBGBVR0, r11
>> +	cp14_ldr_and_write #5, cp14_DBGBCR0, r11
>> +	cp14_ldr_and_write #6, cp14_DBGWVR0, r12
>> +	cp14_ldr_and_write #7, cp14_DBGWCR0, r12
>> +.endm
>> +
>>  /*
>>   * Save the VGIC CPU state into memory
>>   *
>> @@ -684,3 +913,19 @@ ARM_BE8(rev	r6, r6  )
>>  .macro load_vcpu
>>  	mrc	p15, 4, vcpu, c13, c0, 2	@ HTPIDR
>>  .endm
>> +
>> +__save_host_debug_regs:
>> +	save_host_debug_regs
>> +	bx	lr
>> +
>> +__save_guest_debug_regs:
>> +	save_guest_debug_regs
>> +	bx	lr
>> +
>> +__restore_host_debug_regs:
>> +	restore_host_debug_regs
>> +	bx	lr
>> +
>> +__restore_guest_debug_regs:
>> +	restore_guest_debug_regs
>> +	bx	lr
>> -- 
>> 1.7.12.4
>> 
>
>Thanks,
>-Christoffer

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-03  9:54       ` Zhichao Huang
@ 2015-07-03 11:56         ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-07-03 11:56 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel

On Fri, Jul 03, 2015 at 05:54:47PM +0800, Zhichao Huang wrote:
> 
> 
> On June 30, 2015 5:20:20 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >On Mon, Jun 22, 2015 at 06:41:31PM +0800, Zhichao Huang wrote:
> >> The trapping code keeps track of the state of the debug registers,
> >> allowing for the switch code to implement a lazy switching strategy.
> >> 
> >> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> >> ---
> >>  arch/arm/include/asm/kvm_asm.h  |  3 +++
> >>  arch/arm/include/asm/kvm_host.h |  3 +++
> >>  arch/arm/kernel/asm-offsets.c   |  1 +
> >>  arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
> >>  arch/arm/kvm/interrupts_head.S  | 42
> >+++++++++++++++++++++++++++++++++++++++++
> >>  5 files changed, 86 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> >> index ba65e05..4fb64cf 100644
> >> --- a/arch/arm/include/asm/kvm_asm.h
> >> +++ b/arch/arm/include/asm/kvm_asm.h
> >> @@ -64,6 +64,9 @@
> >>  #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
> >>  #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
> >>  
> >> +#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
> >> +#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
> >> +
> >>  #define ARM_EXCEPTION_RESET	  0
> >>  #define ARM_EXCEPTION_UNDEFINED   1
> >>  #define ARM_EXCEPTION_SOFTWARE    2
> >> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> >> index 3d16820..09b54bf 100644
> >> --- a/arch/arm/include/asm/kvm_host.h
> >> +++ b/arch/arm/include/asm/kvm_host.h
> >> @@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
> >>  	/* System control coprocessor (cp14) */
> >>  	u32 cp14[NR_CP14_REGS];
> >>  
> >> +	/* Debug state */
> >> +	u32 debug_flags;
> >> +
> >>  	/*
> >>  	 * Anything that is not used directly from assembly code goes
> >>  	 * here.
> >> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> >> index 9158de0..e876109 100644
> >> --- a/arch/arm/kernel/asm-offsets.c
> >> +++ b/arch/arm/kernel/asm-offsets.c
> >> @@ -185,6 +185,7 @@ int main(void)
> >>    DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu,
> >arch.regs.fiq_regs));
> >>    DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu,
> >arch.regs.usr_regs.ARM_pc));
> >>    DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu,
> >arch.regs.usr_regs.ARM_cpsr));
> >> +  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu,
> >arch.debug_flags));
> >>    DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
> >>    DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
> >>    DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
> >> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
> >> index eeee648..fc0c2ef 100644
> >> --- a/arch/arm/kvm/coproc.c
> >> +++ b/arch/arm/kvm/coproc.c
> >> @@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
> >>  	return true;
> >>  }
> >>  
> >> +/*
> >> + * We want to avoid world-switching all the DBG registers all the
> >> + * time:
> >> + *
> >> + * - If we've touched any debug register, it is likely that we're
> >> + *   going to touch more of them. It then makes sense to disable the
> >> + *   traps and start doing the save/restore dance
> >> + * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
> >> + *   to save/restore the registers, as the guest depends on them.
> >> + *
> >> + * For this, we use a DIRTY bit, indicating the guest has modified
> >the
> >> + * debug registers, used as follow:
> >> + *
> >> + * On guest entry:
> >> + * - If the dirty bit is set (because we're coming back from
> >trapping),
> >> + *   disable the traps, save host registers, restore guest
> >registers.
> >> + * - If debug is actively in use (ARM_DSCR_MDBGEN set),
> >> + *   set the dirty bit, disable the traps, save host registers,
> >> + *   restore guest registers.
> >> + * - Otherwise, enable the traps
> >> + *
> >> + * On guest exit:
> >> + * - If the dirty bit is set, save guest registers, restore host
> >> + *   registers and clear the dirty bit. This ensure that the host can
> >> + *   now use the debug registers.
> >> + *
> >> + * Notice:
> >> + * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
> >> + *   debug is always actively in use (ARM_DSCR_MDBGEN set).
> >> + *   We have to do the save/restore dance in this case, because the
> >> + *   host and the guest might use their respective debug registers
> >> + *   at any moment.
> >
> >so doesn't this pretty much invalidate the whole saving/dirty effort?
> >
> >Guests configured from for example multi_v7_defconfig will then act
> >like
> >this and you will save/restore all these registers always.
> >
> >Wouldn't a better approach be to enable trapping to hyp mode most of
> >the
> >time, and simply clear the enabled bit of any host-used break- or
>watchpoints upon guest entry, perhaps maintaining a bitmap of which
> >ones
> >must be re-set when exiting the guest, and thereby drastically reducing
> >the amount of save/restore code you'd have to perform.
> >
> >Of course, you'd also have to keep track of whether the guest has any
> >breakpoints or watchpoints enabled for when you do the full
> >save/restore
> >dance.
> >
> >That would also avoid all issues surrounding accesses to DBGDSCRext
> >register I think.
> 
> I have thought about it, which means to say, "Since we can't find
> whether the guest has any hwbrpts enabled from the DBGDSCR, why
> don't we find it from the DBGBCR and DBGWCR?".
> 
> Case 1: The host and the guest enable all the hwbrpts.
> It's necessary to world switch the debug registers all the time.
> 
> Case 2: The host and the guest enable some of the hwbrpts.
> It's necessary to world switch the debug registers which are enabled.
> But if we want to skip those registers which aren't enabled, we have to
> keep track of all the debug states in both the host and the guest.
> We need to judge which debug registers we should switch, and which
> not. It may bring complex logic into the assembly code. And if the
> host or guest enables almost all of the hwbrpts, this operation may
> cost more than it gains.
> Is it acceptable and worthwhile? If yes, I will do it.
> 
> Case 3: Neither the host nor the guest enable any hwbrpts.
> It's the case that we can skip the whole world switch thing.
> The only problem is that we have to read all the debug registers on each
> guest entry to find whether the host enabled any hwbrpts or not.
> But this is the common case, so it is worth the effort to
> reduce the saving/restoring overhead.
> 
I would never try to do a partial save/restore; just look at the
control registers to see if anything is enabled, and use that as the
indication of whether or not we need to save/restore all the registers
and disable trapping.

Reading only the guest control registers (from memory) should be much
faster than saving/restoring the whole lot.  Perhaps there's even a hook
in Linux to figure out whether any of the registers are being used?
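
To be concrete, I'm thinking of something along these lines (completely
untested, just to illustrate the idea; it reuses the CP14_OFFSET
addressing from your patch and assumes r5/r6 are free to clobber):

	/* Branch to \target if the guest's saved DBGBCRn or DBGWCRn has
	 * its enable bit (bit 0) set; expand once per implemented
	 * breakpoint/watchpoint index.
	 */
	.macro guest_bp_wp_enabled idx, target
		ldr	r5, [vcpu, #CP14_OFFSET(cp14_DBGBCR0 + \idx)]
		ldr	r6, [vcpu, #CP14_OFFSET(cp14_DBGWCR0 + \idx)]
		orr	r5, r5, r6
		tst	r5, #1
		bne	\target
	.endm

If none of the saved control words has the enable bit set (and MDBGEN is
clear), you can skip the full save/restore and leave trapping enabled.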

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm: implement lazy world switch for debug registers
  2015-07-03 10:06       ` Zhichao Huang
@ 2015-07-03 21:05         ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-07-03 21:05 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, marc.zyngier, will.deacon, huangzhichao, kvmarm, linux-arm-kernel

On Fri, Jul 03, 2015 at 06:06:48PM +0800, Zhichao Huang wrote:
> 
> 
> On June 30, 2015 9:15:22 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >On Mon, Jun 22, 2015 at 06:41:32PM +0800, Zhichao Huang wrote:
> >> Implement switching of the debug registers. While the number
> >> of registers is massive, CPUs usually don't implement them all
> >> (A15 has 6 breakpoints and 4 watchpoints, which gives us a total
> >> of 22 registers "only").
> >> 
> >> Notice that, for ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in
> >> the guest, debug is always actively in use (ARM_DSCR_MDBGEN set).
> >> 
> >> We have to do the save/restore dance in this case, because the host
> >> and the guest might use their respective debug registers at any moment.
> >
> >this sounds expensive, and I suggested an alternative approach in the
> >previsou patch.  In any case, measuring the impact on this on hardware
> >would be a great idea...
> >
> >> 
> >> If the CONFIG_HAVE_HW_BREAKPOINT is not set, and if no one flagged
> >> the debug registers as dirty, we only save/resotre DBGDSCR.
> >
> >restore
> >
> >> 
> >> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
> >> ---
> >>  arch/arm/kvm/interrupts.S      |  16 +++
> >>  arch/arm/kvm/interrupts_head.S | 249 ++++++++++++++++++++++++++++++++++++++++-
> >>  2 files changed, 263 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> >> index 79caf79..d626275 100644
> >> --- a/arch/arm/kvm/interrupts.S
> >> +++ b/arch/arm/kvm/interrupts.S
> >> @@ -116,6 +116,12 @@ ENTRY(__kvm_vcpu_run)
> >>  	read_cp15_state store_to_vcpu = 0
> >>  	write_cp15_state read_from_vcpu = 1
> >>  
> >> +	@ Store hardware CP14 state and load guest state
> >> +	compute_debug_state 1f
> >> +	bl __save_host_debug_regs
> >> +	bl __restore_guest_debug_regs
> >> +
> >> +1:
> >>  	@ If the host kernel has not been configured with VFPv3 support,
> >>  	@ then it is safer if we deny guests from using it as well.
> >>  #ifdef CONFIG_VFPv3
> >> @@ -201,6 +207,16 @@ after_vfp_restore:
> >>  	mrc	p15, 0, r2, c0, c0, 5
> >>  	mcr	p15, 4, r2, c0, c0, 5
> >>  
> >> +	@ Store guest CP14 state and restore host state
> >> +	skip_debug_state 1f
> >> +	bl __save_guest_debug_regs
> >> +	bl __restore_host_debug_regs
> >> +	/* Clear the dirty flag for the next run, as all the state has
> >> +	 * already been saved. Note that we nuke the whole 32bit word.
> >> +	 * If we ever add more flags, we'll have to be more careful...
> >> +	 */
> >> +	clear_debug_dirty_bit
> >> +1:
> >>  	@ Store guest CP15 state and restore host state
> >>  	read_cp15_state store_to_vcpu = 1
> >>  	write_cp15_state read_from_vcpu = 0
> >> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> >> index 5662c39..ed406be 100644
> >> --- a/arch/arm/kvm/interrupts_head.S
> >> +++ b/arch/arm/kvm/interrupts_head.S
> >> @@ -7,6 +7,7 @@
> >>  #define VCPU_USR_SP		(VCPU_USR_REG(13))
> >>  #define VCPU_USR_LR		(VCPU_USR_REG(14))
> >>  #define CP15_OFFSET(_cp15_reg_idx) (VCPU_CP15 + (_cp15_reg_idx * 4))
> >> +#define CP14_OFFSET(_cp14_reg_idx) (VCPU_CP14 + ((_cp14_reg_idx) *
> >4))
> >>  
> >>  /*
> >>   * Many of these macros need to access the VCPU structure, which is
> >always
> >> @@ -168,8 +169,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>   * Clobbers *all* registers.
> >>   */
> >>  .macro restore_guest_regs
> >> -	/* reset DBGDSCR to disable debug mode */
> >> -	mov	r2, #0
> >> +	ldr	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
> >>  	mcr	p14, 0, r2, c0, c2, 2
> >>  
> >>  	restore_guest_regs_mode svc, #VCPU_SVC_REGS
> >> @@ -250,6 +250,10 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>  	save_guest_regs_mode abt, #VCPU_ABT_REGS
> >>  	save_guest_regs_mode und, #VCPU_UND_REGS
> >>  	save_guest_regs_mode irq, #VCPU_IRQ_REGS
> >> +
> >> +	/* DBGDSCR reg */
> >> +	mrc	p14, 0, r2, c0, c1, 0
> >> +	str	r2, [vcpu, #CP14_OFFSET(cp14_DBGDSCRext)]
> >>  .endm
> >>  
> >>  /* Reads cp15 registers from hardware and stores them in memory
> >> @@ -449,6 +453,231 @@ vcpu	.req	r0		@ vcpu pointer always in r0
> >>  	str	r5, [vcpu, #VCPU_DEBUG_FLAGS]
> >>  .endm
> >>  
> >> +/* Assume r11/r12 in used, clobbers r2-r10 */
> >> +.macro cp14_read_and_push Op2 skip_num
> >> +	cmp	\skip_num, #8
> >> +	// if (skip_num >= 8) then skip c8-c15 directly
> >> +	bge	1f
> >> +	adr	r2, 9998f
> >> +	add	r2, r2, \skip_num, lsl #2
> >> +	bx	r2
> >> +1:
> >> +	adr	r2, 9999f
> >> +	sub	r3, \skip_num, #8
> >> +	add	r2, r2, r3, lsl #2
> >> +	bx	r2
> >> +9998:
> >> +	mrc	p14, 0, r10, c0, c15, \Op2
> >> +	mrc	p14, 0, r9, c0, c14, \Op2
> >> +	mrc	p14, 0, r8, c0, c13, \Op2
> >> +	mrc	p14, 0, r7, c0, c12, \Op2
> >> +	mrc	p14, 0, r6, c0, c11, \Op2
> >> +	mrc	p14, 0, r5, c0, c10, \Op2
> >> +	mrc	p14, 0, r4, c0, c9, \Op2
> >> +	mrc	p14, 0, r3, c0, c8, \Op2
> >> +	push	{r3-r10}
> >
> >you probably don't want to do more stores to memory than required
> 
> Yeah, there is no need to push some registers, but I can't find a better 
> way to optimize it; are there any precedents that I can refer to?

Can you not simply do what you do below where you read the coproc
register and then do the store of that and so on?

If you unify the two approaches you should be in the clear on this one
anyway...

> 
> Imagine that there are only 2 hwbrpts available, BCR0/BCR1, and we
> can code like this:
> 
> Save:
> 		 jump to 1
>     save BCR2 to r5
> 1:
>     save BCR1 to r4
>     save BCR0 to r3
>     push {r3-r5}
> 
> Restore:
>     pop {r3-r5}
>     jump to 1
>     restore r5 to BCR2
> 1:
>     restore r4 to BCR1
>     restore r3 to BCR0
> 
> But if we want to push only the registers we actually need, we have to
> code like this:
> 
> Save:
>     jump to 1
>     save BCR2 to r5
>     push r5
> 1:
>     save BCR1 to r4
>     push r4
>     save BCR0 to r3
>     push r3
> 
> Resotre:
>     jump to 1
>     pop r5
>     restore r5 to BCR2
> 1:
>     pop r4
>     restore r4 to BCR1
>     pop r3
>     restore r3 to BCR0
> 
> Then we might encounter a mistake on restoring, as we want to pop
> r4 but actually pop r3.
> 
> >
> >> +9999:
> >> +	mrc	p14, 0, r10, c0, c7, \Op2
> >> +	mrc	p14, 0, r9, c0, c6, \Op2
> >> +	mrc	p14, 0, r8, c0, c5, \Op2
> >> +	mrc	p14, 0, r7, c0, c4, \Op2
> >> +	mrc	p14, 0, r6, c0, c3, \Op2
> >> +	mrc	p14, 0, r5, c0, c2, \Op2
> >> +	mrc	p14, 0, r4, c0, c1, \Op2
> >> +	mrc	p14, 0, r3, c0, c0, \Op2
> >> +	push	{r3-r10}
> >
> >same
> >
> >> +.endm
> >> +
> >> +/* Assume r11/r12 in used, clobbers r2-r10 */
> >> +.macro cp14_pop_and_write Op2 skip_num
> >> +	cmp	\skip_num, #8
> >> +	// if (skip_num >= 8) then skip c8-c15 directly
> >> +	bge	1f
> >> +	adr	r2, 9998f
> >> +	add	r2, r2, \skip_num, lsl #2
> >> +	pop	{r3-r10}
> >
> >you probably don't want to do more loads from memory than required
> >
> >> +	bx	r2
> >> +1:
> >> +	adr	r2, 9999f
> >> +	sub	r3, \skip_num, #8
> >> +	add	r2, r2, r3, lsl #2
> >> +	pop	{r3-r10}
> >
> >same
> >
> >> +	bx	r2
> >> +
> >> +9998:
> >> +	mcr	p14, 0, r10, c0, c15, \Op2
> >> +	mcr	p14, 0, r9, c0, c14, \Op2
> >> +	mcr	p14, 0, r8, c0, c13, \Op2
> >> +	mcr	p14, 0, r7, c0, c12, \Op2
> >> +	mcr	p14, 0, r6, c0, c11, \Op2
> >> +	mcr	p14, 0, r5, c0, c10, \Op2
> >> +	mcr	p14, 0, r4, c0, c9, \Op2
> >> +	mcr	p14, 0, r3, c0, c8, \Op2
> >> +
> >> +	pop	{r3-r10}
> >> +9999:
> >> +	mcr	p14, 0, r10, c0, c7, \Op2
> >> +	mcr	p14, 0, r9, c0, c6, \Op2
> >> +	mcr	p14, 0, r8, c0, c5, \Op2
> >> +	mcr	p14, 0, r7, c0, c4, \Op2
> >> +	mcr	p14, 0, r6, c0, c3, \Op2
> >> +	mcr	p14, 0, r5, c0, c2, \Op2
> >> +	mcr	p14, 0, r4, c0, c1, \Op2
> >> +	mcr	p14, 0, r3, c0, c0, \Op2
> >> +.endm
> >> +
> >> +/* Assume r11/r12 in used, clobbers r2-r3 */
> >> +.macro cp14_read_and_str Op2 cp14_reg0 skip_num
> >> +	adr	r3, 1f
> >> +	add	r3, r3, \skip_num, lsl #3
> >> +	bx	r3
> >> +1:
> >> +	mrc	p14, 0, r2, c0, c15, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
> >> +	mrc	p14, 0, r2, c0, c14, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
> >> +	mrc	p14, 0, r2, c0, c13, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
> >> +	mrc	p14, 0, r2, c0, c12, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
> >> +	mrc	p14, 0, r2, c0, c11, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
> >> +	mrc	p14, 0, r2, c0, c10, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
> >> +	mrc	p14, 0, r2, c0, c9, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
> >> +	mrc	p14, 0, r2, c0, c8, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
> >> +	mrc	p14, 0, r2, c0, c7, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
> >> +	mrc	p14, 0, r2, c0, c6, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
> >> +	mrc	p14, 0, r2, c0, c5, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
> >> +	mrc	p14, 0, r2, c0, c4, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
> >> +	mrc	p14, 0, r2, c0, c3, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
> >> +	mrc	p14, 0, r2, c0, c2, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
> >> +	mrc	p14, 0, r2, c0, c1, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
> >> +	mrc	p14, 0, r2, c0, c0, \Op2
> >> +	str     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
> >> +.endm
> >> +
> >> +/* Assume r11/r12 in used, clobbers r2-r3 */
> >> +.macro cp14_ldr_and_write Op2 cp14_reg0 skip_num
> >> +	adr	r3, 1f
> >> +	add	r3, r3, \skip_num, lsl #3
> >> +	bx	r3
> >> +1:
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+15)]
> >> +	mcr	p14, 0, r2, c0, c15, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+14)]
> >> +	mcr	p14, 0, r2, c0, c14, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+13)]
> >> +	mcr	p14, 0, r2, c0, c13, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+12)]
> >> +	mcr	p14, 0, r2, c0, c12, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+11)]
> >> +	mcr	p14, 0, r2, c0, c11, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+10)]
> >> +	mcr	p14, 0, r2, c0, c10, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+9)]
> >> +	mcr	p14, 0, r2, c0, c9, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+8)]
> >> +	mcr	p14, 0, r2, c0, c8, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+7)]
> >> +	mcr	p14, 0, r2, c0, c7, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+6)]
> >> +	mcr	p14, 0, r2, c0, c6, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+5)]
> >> +	mcr	p14, 0, r2, c0, c5, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+4)]
> >> +	mcr	p14, 0, r2, c0, c4, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+3)]
> >> +	mcr	p14, 0, r2, c0, c3, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+2)]
> >> +	mcr	p14, 0, r2, c0, c2, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0+1)]
> >> +	mcr	p14, 0, r2, c0, c1, \Op2
> >> +	ldr     r2, [vcpu, #CP14_OFFSET(\cp14_reg0)]
> >> +	mcr	p14, 0, r2, c0, c0, \Op2
> >> +.endm
> >
> >can you not find some way of unifying cp14_pop_and_write with
> >cp14_ldr_and_write and cp14_read_and_push with cp14_read_and_str ?
> >
> >Probably having two separate structs for the VFP state on the vcpu
> >struct
> >for both the guest and the host state is one possible way of doing so.
> >
> 
> OK, I will do it.
> Would you like me to rename the struct vfp_hard_struct, and add 
> host_cp14_state in there, or add a new struct host_cp14_state in the
> kvm_vcpu_arch?
> 

Not sure I understand the question exactly.  I would probably define
kvm_cpu_context_t as a new struct that includes the VFP state instead of
it being a typedef, but I haven't looked at it too carefully.
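
For illustration only, the kind of restructuring I have in mind is
roughly the following; the field names are just placeholders, I haven't
thought about the actual layout:

/*
 * Sketch only: make kvm_cpu_context_t a real struct instead of a
 * typedef of vfp_hard_struct, so the host and the guest banked state
 * share one layout and the same save/restore code can target either
 * copy.
 */
struct kvm_cpu_context {
	struct vfp_hard_struct vfp;
	u32 cp14[NR_CP14_REGS];
};

typedef struct kvm_cpu_context kvm_cpu_context_t;

kvm_vcpu_arch could then hold one of these for the guest plus a pointer
to the host's copy, and the cp14 macros would only need a base pointer
instead of separate stack and vcpu variants.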


> >> +
> >> +/* Get extract number of BRPs and WRPs. Saved in r11/r12 */
> >> +.macro read_hw_dbg_num
> >> +	mrc	p14, 0, r2, c0, c0, 0
> >> +	ubfx	r11, r2, #24, #4
> >> +	add	r11, r11, #1		// Extract BRPs
> >> +	ubfx	r12, r2, #28, #4
> >> +	add	r12, r12, #1		// Extract WRPs
> >> +	mov	r2, #16
> >> +	sub	r11, r2, r11		// How many BPs to skip
> >> +	sub	r12, r2, r12		// How many WPs to skip
> >> +.endm
> >> +
> >> +/* Reads cp14 registers from hardware.
> >
> >You have a lot of multi-line comments in these patches which don't
> >start
> >with a separate '/*' line, as dictated by the Linux kernel coding
> >style.
> >So far, I've ignored this, but please fix all these throughout the
> >series when you respin.
> >
> >> + * Writes cp14 registers in-order to the stack.
> >> + *
> >> + * Assumes vcpu pointer in vcpu reg
> >> + *
> >> + * Clobbers r2-r12
> >> + */
> >> +.macro save_host_debug_regs
> >> +	read_hw_dbg_num
> >> +	cp14_read_and_push #4, r11	@ DBGBVR
> >> +	cp14_read_and_push #5, r11	@ DBGBCR
> >> +	cp14_read_and_push #6, r12	@ DBGWVR
> >> +	cp14_read_and_push #7, r12	@ DBGWCR
> >> +.endm
> >> +
> >> +/* Reads cp14 registers from hardware.
> >> + * Writes cp14 registers in-order to the VCPU struct pointed to by
> >vcpup.
> >> + *
> >> + * Assumes vcpu pointer in vcpu reg
> >> + *
> >> + * Clobbers r2-r12
> >> + */
> >> +.macro save_guest_debug_regs
> >> +	read_hw_dbg_num
> >> +	cp14_read_and_str #4, cp14_DBGBVR0, r11
> >
> >why do you need the has before the op2 field?
> 
> Sorry, I can't quite understand.
> 

heh, I meant hash; why is it '#4' instead of '4'?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-03 11:56         ` Christoffer Dall
@ 2015-07-07 10:06           ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-07 10:06 UTC (permalink / raw)
  To: will.deacon
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, alex.bennee,
	Christoffer Dall, huangzhichao

Hi, Will,

Chazy and I are talking about how to reduce the saving/restoring
overhead for debug registers.
We want to add a state in hw_breakpoint.c to indicate whether the host
enables any hwbrpts or not (we might export a function that kvm can call),
so that we can read this state from memory instead of reading from real
hardware registers, and decide whether we need a world switch or
not.
Is it acceptable?
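
The hook itself would be tiny, roughly the declaration below, with a
stub for !CONFIG_HAVE_HW_BREAKPOINT so KVM can call it unconditionally
(the name and the exact placement are provisional):

/* e.g. in arch/arm/include/asm/hw_breakpoint.h, provisional */
#ifdef CONFIG_HAVE_HW_BREAKPOINT
bool hw_breakpoint_enabled(void);
#else
static inline bool hw_breakpoint_enabled(void)
{
	return false;
}
#endif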


On July 3, 2015 7:56:11 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>On Fri, Jul 03, 2015 at 05:54:47PM +0800, Zhichao Huang wrote:
>> 
>> 
>> On June 30, 2015 5:20:20 PM GMT+08:00, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>> >On Mon, Jun 22, 2015 at 06:41:31PM +0800, Zhichao Huang wrote:
>> >> The trapping code keeps track of the state of the debug registers,
>> >> allowing for the switch code to implement a lazy switching strategy.
>> >> 
>> >> Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
>> >> ---
>> >>  arch/arm/include/asm/kvm_asm.h  |  3 +++
>> >>  arch/arm/include/asm/kvm_host.h |  3 +++
>> >>  arch/arm/kernel/asm-offsets.c   |  1 +
>> >>  arch/arm/kvm/coproc.c           | 39 ++++++++++++++++++++++++++++++++++++--
>> >>  arch/arm/kvm/interrupts_head.S  | 42 +++++++++++++++++++++++++++++++++++++++++
>> >>  5 files changed, 86 insertions(+), 2 deletions(-)
>> >> 
>> >> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> >> index ba65e05..4fb64cf 100644
>> >> --- a/arch/arm/include/asm/kvm_asm.h
>> >> +++ b/arch/arm/include/asm/kvm_asm.h
>> >> @@ -64,6 +64,9 @@
>> >>  #define cp14_DBGDSCRext	65	/* Debug Status and Control external */
>> >>  #define NR_CP14_REGS	66	/* Number of regs (incl. invalid) */
>> >>  
>> >> +#define KVM_ARM_DEBUG_DIRTY_SHIFT	0
>> >> +#define KVM_ARM_DEBUG_DIRTY		(1 << KVM_ARM_DEBUG_DIRTY_SHIFT)
>> >> +
>> >>  #define ARM_EXCEPTION_RESET	  0
>> >>  #define ARM_EXCEPTION_UNDEFINED   1
>> >>  #define ARM_EXCEPTION_SOFTWARE    2
>> >> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> >> index 3d16820..09b54bf 100644
>> >> --- a/arch/arm/include/asm/kvm_host.h
>> >> +++ b/arch/arm/include/asm/kvm_host.h
>> >> @@ -127,6 +127,9 @@ struct kvm_vcpu_arch {
>> >>  	/* System control coprocessor (cp14) */
>> >>  	u32 cp14[NR_CP14_REGS];
>> >>  
>> >> +	/* Debug state */
>> >> +	u32 debug_flags;
>> >> +
>> >>  	/*
>> >>  	 * Anything that is not used directly from assembly code goes
>> >>  	 * here.
>> >> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>> >> index 9158de0..e876109 100644
>> >> --- a/arch/arm/kernel/asm-offsets.c
>> >> +++ b/arch/arm/kernel/asm-offsets.c
>> >> @@ -185,6 +185,7 @@ int main(void)
>> >>    DEFINE(VCPU_FIQ_REGS,		offsetof(struct kvm_vcpu,
>> >arch.regs.fiq_regs));
>> >>    DEFINE(VCPU_PC,		offsetof(struct kvm_vcpu,
>> >arch.regs.usr_regs.ARM_pc));
>> >>    DEFINE(VCPU_CPSR,		offsetof(struct kvm_vcpu,
>> >arch.regs.usr_regs.ARM_cpsr));
>> >> +  DEFINE(VCPU_DEBUG_FLAGS,	offsetof(struct kvm_vcpu,
>> >arch.debug_flags));
>> >>    DEFINE(VCPU_HCR,		offsetof(struct kvm_vcpu, arch.hcr));
>> >>    DEFINE(VCPU_IRQ_LINES,	offsetof(struct kvm_vcpu, arch.irq_lines));
>> >>    DEFINE(VCPU_HSR,		offsetof(struct kvm_vcpu, arch.fault.hsr));
>> >> diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
>> >> index eeee648..fc0c2ef 100644
>> >> --- a/arch/arm/kvm/coproc.c
>> >> +++ b/arch/arm/kvm/coproc.c
>> >> @@ -220,14 +220,49 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
>> >>  	return true;
>> >>  }
>> >>  
>> >> +/*
>> >> + * We want to avoid world-switching all the DBG registers all the
>> >> + * time:
>> >> + *
>> >> + * - If we've touched any debug register, it is likely that we're
>> >> + *   going to touch more of them. It then makes sense to disable the
>> >> + *   traps and start doing the save/restore dance
>> >> + * - If debug is active (ARM_DSCR_MDBGEN set), it is then mandatory
>> >> + *   to save/restore the registers, as the guest depends on them.
>> >> + *
>> >> + * For this, we use a DIRTY bit, indicating the guest has modified the
>> >> + * debug registers, used as follow:
>> >> + *
>> >> + * On guest entry:
>> >> + * - If the dirty bit is set (because we're coming back from trapping),
>> >> + *   disable the traps, save host registers, restore guest registers.
>> >> + * - If debug is actively in use (ARM_DSCR_MDBGEN set),
>> >> + *   set the dirty bit, disable the traps, save host registers,
>> >> + *   restore guest registers.
>> >> + * - Otherwise, enable the traps
>> >> + *
>> >> + * On guest exit:
>> >> + * - If the dirty bit is set, save guest registers, restore host
>> >> + *   registers and clear the dirty bit. This ensure that the host can
>> >> + *   now use the debug registers.
>> >> + *
>> >> + * Notice:
>> >> + * - For ARMv7, if the CONFIG_HAVE_HW_BREAKPOINT is set in the guest,
>> >> + *   debug is always actively in use (ARM_DSCR_MDBGEN set).
>> >> + *   We have to do the save/restore dance in this case, because the
>> >> + *   host and the guest might use their respective debug registers
>> >> + *   at any moment.
>> >
>> >so doesn't this pretty much invalidate the whole saving/dirty effort?
>> >
>> >Guests configured from for example multi_v7_defconfig will then act
>> >like
>> >this and you will save/restore all these registers always.
>> >
>> >Wouldn't a better approach be to enable trapping to hyp mode most of
>> >the
>> >time, and simply clear the enabled bit of any host-used break- or
>> >watchpoints upon guest entry, perhaps maintaining a bitmap of which
>> >ones
>> >must be re-set when exiting the guest, and thereby drastically
>reducing
>> >the amount of save/restore code you'd have to perform.
>> >
>> >Of course, you'd also have to keep track of whether the guest has any
>> >breakpoints or watchpoints enabled for when you do the full
>> >save/restore
>> >dance.
>> >
>> >That would also avoid all issues surrounding accesses to DBGDSCRext
>> >register I think.
>> 
>> I have thought about it, which means to say, "Since we can't find
>> whether the guest has any hwbrpts enabled from the DBGDSCR, why
>> don't we find it from the DBGBCR and DBGWCR?".
>> 
>> Case 1: The host and the guest enable all the hwbrpts.
>> It's necessary to world switch the debug registers all the time.
>> 
>> Case 2: The host and the guest enable some of the hwbrpts.
>> It's necessary to world switch the debug registers which are enabled.
>> But if we want to skip those registers which aren't enabled, we have to
>> keep track of all the debug states both in the host and the guest.
>> We need to judge which debug registers we should switch, and which
>> not. It may bring in a complex logic in the assembly code. And if the
>> host or guest enabled almost all of the hwbrpts, this operation may
>> bring in a loss that outweighs the gain.
>> Is it acceptable and worthy? If yes, I will do it.
>> 
>> Case 3: Neither the host nor the guest enable any hwbrpts.
>> It's the case that we can skip the whole world switch thing.
>> The only problem is that we have to read all the debug registers on each
>> guest entry to find whether the host enable any hwbrpts or not.
>> But this case is a normal case, which is worthy to do the efforts to
>> reduce the saving/restoring overhead.
>> 
>I would never try to do a partial save/restore, just look at the
>control registers to see if anything is enabled as an indication of
>whether or not we need to do the save/restore of all the registers and
>disable trapping.
>
>Reading the guest control registers (from memory) only should be much
>faster than saving/restoring the whole lot.  Perhaps there's even a
>hook
>in Linux to figure out if any of the registers are being used?
>
>Thanks,
>-Christoffer

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-07 10:06           ` Zhichao Huang
@ 2015-07-07 10:24             ` Will Deacon
  -1 siblings, 0 replies; 82+ messages in thread
From: Will Deacon @ 2015-07-07 10:24 UTC (permalink / raw)
  To: Zhichao Huang
  Cc: kvm, linux-arm-kernel, kvmarm, Marc Zyngier, alex.bennee,
	Christoffer Dall, huangzhichao

On Tue, Jul 07, 2015 at 11:06:57AM +0100, Zhichao Huang wrote:
> Chazy and I are talking about how to reduce the saving/restoring
> overhead for debug registers.
> We want to add a state in hw_breakpoint.c to indicate whether the host
> enables any hwbrpts or not (we might export a function that kvm can call),
> so that we can read this state from memory instead of reading from real
> hardware registers, and decide whether we need a world switch or
> not.
> Is it acceptable?

Maybe, hard to tell without the code. There are obvious races to deal with
if you use variables to indicate whether resources are in use -- why not
just trap debug access from the host as well? Then you could keep track of
the "owner" in kvm and trap accesses from everybody else.

Will

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-07 10:24             ` Will Deacon
@ 2015-07-08 10:50               ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-08 10:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvm, linux-arm-kernel, kvmarm, Marc Zyngier, alex.bennee,
	Christoffer Dall, huangzhichao

Hi, Will,

Are you happy with this?:

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c

+bool hw_breakpoint_enabled(void)
+{
+    struct perf_event **slots;
+    int i;
+
+    slots = this_cpu_ptr(bp_on_reg);
+    for (i = 0; i < core_num_brps; i++) {
+        if (slots[i])
+            return true;
+    }
+
+    slots = this_cpu_ptr(wp_on_reg);
+    for (i = 0; i < core_num_wrps; i++) {
+        if (slots[i])
+            return true;
+    }
+
+    return false;
+}

It doesn't change any existing functions, and doesn't even add any new
variables; it just provides an indication for KVM, and it's low-overhead.

We will only call it upon guest entry, so there is also no race for it.
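
For guest entry the decision would then be no more than this (C
pseudocode only; in the series the actual check sits in the
world-switch assembly, and hw_breakpoint_enabled() is the helper
proposed above):

/*
 * Sketch: do we need the full cp14 save/restore on this entry?
 * debug_flags and KVM_ARM_DEBUG_DIRTY come from patch 08 of this
 * series.
 */
static bool need_debug_switch(struct kvm_vcpu *vcpu)
{
	/* the guest has touched its debug registers since the last switch */
	if (vcpu->arch.debug_flags & KVM_ARM_DEBUG_DIRTY)
		return true;

	/* the host has live breakpoints/watchpoints to hide from the guest */
	if (hw_breakpoint_enabled())
		return true;

	/* cheap path: only DBGDSCR needs to be switched */
	return false;
}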


On July 7, 2015 6:24:06 PM GMT+08:00, Will Deacon <will.deacon@arm.com> wrote:
>On Tue, Jul 07, 2015 at 11:06:57AM +0100, Zhichao Huang wrote:
>> Chazy and me are talking about how to reduce the saving/restoring
>> overhead for debug registers.
>> We want to add a state in hw_breakpoint.c to indicate whether the
>host
>> enable any hwbrpts or not (might export a fuction that kvm can call),
>> then we can read this state from memory instead of reading from real
>> hardware registers, and to decide whether we need a world switch or
>> not.
>> Does it acceptable?
>
>Maybe, hard to tell without the code. There are obvious races to deal
>with
>if you use variables to indicate whether resources are in use -- why
>not
>just trap debug access from the host as well? Then you could keep track
>of
>the "owner" in kvm and trap accesses from everybody else.
>
>Will

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-08 10:50               ` Zhichao Huang
@ 2015-07-08 17:08                 ` Will Deacon
  -1 siblings, 0 replies; 82+ messages in thread
From: Will Deacon @ 2015-07-08 17:08 UTC (permalink / raw)
  To: Zhichao Huang; +Cc: kvm, Marc Zyngier, huangzhichao, linux-arm-kernel, kvmarm

On Wed, Jul 08, 2015 at 11:50:22AM +0100, Zhichao Huang wrote:
> Are you happy with this?:

You miss the reserved breakpoint, I think.
I also still don't understand why this is preferable to trapping.

Will

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-07 10:24             ` Will Deacon
@ 2015-07-09 11:50               ` Christoffer Dall
  -1 siblings, 0 replies; 82+ messages in thread
From: Christoffer Dall @ 2015-07-09 11:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvm, Marc Zyngier, huangzhichao, Zhichao Huang, kvmarm, linux-arm-kernel

On Tue, Jul 07, 2015 at 11:24:06AM +0100, Will Deacon wrote:
> On Tue, Jul 07, 2015 at 11:06:57AM +0100, Zhichao Huang wrote:
> > Chazy and I are talking about how to reduce the saving/restoring
> > overhead for debug registers.
> > We want to add a state in hw_breakpoint.c to indicate whether the host
> > enables any hwbrpts or not (we might export a function that kvm can call),
> > so that we can read this state from memory instead of reading from real
> > hardware registers, and decide whether we need a world switch or
> > not.
> > Is it acceptable?
> 
> Maybe, hard to tell without the code. There are obvious races to deal with
> if you use variables to indicate whether resources are in use -- why not
> just trap debug access from the host as well? Then you could keep track of
> the "owner" in kvm and trap accesses from everybody else.
> 
The only information we're looking for here is whether the host has
enabled some break/watch point so that we need to disable them before
running the guest.

Just to re-iterate, when we are about to run a guest, we have the
following cases:

1) Neither the host nor the guest has configured any [WB]points
2) Only the host has configured any [WB]points
3) Only the guest has configured any [WB]points
4) Both the host and the guest have configured any [WB]points

In case (1), KVM should enable trapping and switch the register state on
guest accesses.

In cases (2), (3), and (4) we must switch the register state on each
entry/exit.

If we are to trap debug register accesses in KVM to set a flag to keep
track of the owner (iow. has the host touched the registers) then don't
we impose an ordering requirement of whether KVM or the breakpoint
functionality gets initialized first, and we need to take special care
when tearing down KVM to disable the traps?  It sounds a little complex.

I've previously suggested to simply look at the B/W control registers to
figure out what to do.  Caching the state in memory is an optimization,
do we even have any idea how important such an optimization is?
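
Concretely, the in-memory check I have in mind is no more than
something like the sketch below, assuming the cp14_DBGBCR0/cp14_DBGWCR0
indices exist alongside the DBGBVR ones in this series, and with
guest_num_brps()/guest_num_wrps() standing in for however the
implemented counts are obtained:

/* Sketch: is any enable bit set in the guest's saved DBGBCRn/DBGWCRn? */
static bool vcpu_debug_enabled(struct kvm_vcpu *vcpu)
{
	int i;

	for (i = 0; i < guest_num_brps(); i++)
		if (vcpu->arch.cp14[cp14_DBGBCR0 + i] & 0x1)	/* DBGBCR.E */
			return true;

	for (i = 0; i < guest_num_wrps(); i++)
		if (vcpu->arch.cp14[cp14_DBGWCR0 + i] & 0x1)	/* DBGWCR.E */
			return true;

	return false;
}

A similar walk over the host side (or a hook such as the one suggested
above) covers the other half, and only when neither side has anything
enabled do we skip the full switch.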

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-08 17:08                 ` Will Deacon
@ 2015-07-09 12:54                   ` Zhichao Huang
  -1 siblings, 0 replies; 82+ messages in thread
From: Zhichao Huang @ 2015-07-09 12:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvm, linux-arm-kernel, kvmarm, Marc Zyngier, alex.bennee,
	Christoffer Dall, huangzhichao



On July 9, 2015 1:08:55 AM GMT+08:00, Will Deacon <will.deacon@arm.com> wrote:
>On Wed, Jul 08, 2015 at 11:50:22AM +0100, Zhichao Huang wrote:
>> Are you happy with this?:
>
>You miss the reserved breakpoint, I think.

Sorry, I can't quite understand. What is the reserved breakpoint?

When arch_install/arch_uninstall_hw_breakpoint is called, it only
sets/clears the brps and wrps slots.

>I also still don't understand why this is preferable to trapping.
>
>Will

-- 
Zhichao Huang

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers
  2015-07-09 11:50               ` Christoffer Dall
@ 2015-07-13 12:12                 ` zichao
  -1 siblings, 0 replies; 82+ messages in thread
From: zichao @ 2015-07-13 12:12 UTC (permalink / raw)
  To: Christoffer Dall, Will Deacon
  Cc: kvm, linux-arm-kernel, kvmarm, Marc Zyngier, alex.bennee, huangzhichao


On 2015/7/9 19:50, Christoffer Dall wrote:
> On Tue, Jul 07, 2015 at 11:24:06AM +0100, Will Deacon wrote:
>> On Tue, Jul 07, 2015 at 11:06:57AM +0100, Zhichao Huang wrote:
>>> Chazy and I are talking about how to reduce the saving/restoring
>>> overhead for debug registers.
>>> We want to add a state in hw_breakpoint.c to indicate whether the host
>>> enables any hwbrpts or not (we might export a function that kvm can call),
>>> so that we can read this state from memory instead of reading from real
>>> hardware registers, and decide whether we need a world switch or
>>> not.
>>> Is it acceptable?
>>
>> Maybe, hard to tell without the code. There are obvious races to deal with
>> if you use variables to indicate whether resources are in use -- why not
>> just trap debug access from the host as well? Then you could keep track of
>> the "owner" in kvm and trap accesses from everybody else.
>>
> The only information we're looking for here is whether the host has
> enabled some break/watch point so that we need to disable them before
> running the guest.
> 
> Just to re-iterate, when we are about to run a guest, we have the
> following cases:
> 
> 1) Neither the host nor the guest has configured any [WB]points
> 2) Only the host has configured any [WB]points
> 3) Only the guest has configured any [WB]points
> 4) Both the host and the guest have configured any [WB]points
> 
> In case (1), KVM should enable trapping and switch the register state on
> guest accesses.
> 
> In cases (2), (3), and (4) we must switch the register state on each
> entry/exit.
> 
> If we are to trap debug register accesses in KVM to set a flag to keep
> track of the owner (iow. has the host touched the registers) then don't
> we impose an ordering requirement of whether KVM or the breakpoint
> functionality gets initialized first, and we need to take special care
> when tearing down KVM to disable the traps?  It sounds a little complex.
> 
> I've previously suggested to simply look at the B/W control registers to
> figure out what to do.  Caching the state in memory is an optimization,
> do we even have any idea how important such an optimization is?
> 

I have tested the overhead in both el1 and el2 on a D01 board (ARMv7).

Each "MRC p14 ..." instruction costs 8 cycles, and each "MCR p14 ..." costs 5 cycles.

A15 has 6 breakpoints and 4 watchpoints, which gives us a total of 20 registers,
and the overhead in each world switch is at least (20*8 + 20*5 = 260) cycles.


> Thanks,
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 82+ messages in thread

end of thread, other threads:[~2015-07-13 12:12 UTC | newest]

Thread overview: 82+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-22 10:41 [PATCH v3 00/11] KVM: arm: debug infrastructure support Zhichao Huang
2015-06-22 10:41 ` Zhichao Huang
2015-06-22 10:41 ` [PATCH v3 01/11] KVM: arm: plug guest debug exploit Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-29 15:49   ` Christoffer Dall
2015-06-29 15:49     ` Christoffer Dall
2015-07-01  7:04     ` zichao
2015-07-01  7:04       ` zichao
2015-07-01  9:00       ` Christoffer Dall
2015-07-01  9:00         ` Christoffer Dall
2015-07-01  9:00         ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 02/11] KVM: arm: rename pm_fake handler to trap_raz_wi Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:20   ` Christoffer Dall
2015-06-30 13:20     ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 03/11] KVM: arm: enable to use the ARM_DSCR_MDBGEN macro from KVM assembly code Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:20   ` Christoffer Dall
2015-06-30 13:20     ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15 Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-29 19:43   ` Christoffer Dall
2015-06-29 19:43     ` Christoffer Dall
2015-07-01  7:09     ` zichao
2015-07-01  7:09       ` zichao
2015-07-01  9:00       ` Christoffer Dall
2015-07-01  9:00         ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 05/11] KVM: arm: check ordering of all system register tables Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:20   ` Christoffer Dall
2015-06-30 13:20     ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-29 21:16   ` Christoffer Dall
2015-06-29 21:16     ` Christoffer Dall
2015-07-01  7:14     ` zichao
2015-07-01  7:14       ` zichao
2015-06-22 10:41 ` [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit " Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:20   ` Christoffer Dall
2015-06-30 13:20     ` Christoffer Dall
2015-07-01  7:43     ` Zhichao Huang
2015-07-01  7:43       ` Zhichao Huang
2015-06-22 10:41 ` [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for " Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30  9:20   ` Christoffer Dall
2015-06-30  9:20     ` Christoffer Dall
2015-07-03  9:54     ` Zhichao Huang
2015-07-03  9:54       ` Zhichao Huang
2015-07-03 11:56       ` Christoffer Dall
2015-07-03 11:56         ` Christoffer Dall
2015-07-07 10:06         ` Zhichao Huang
2015-07-07 10:06           ` Zhichao Huang
2015-07-07 10:24           ` Will Deacon
2015-07-07 10:24             ` Will Deacon
2015-07-08 10:50             ` Zhichao Huang
2015-07-08 10:50               ` Zhichao Huang
2015-07-08 17:08               ` Will Deacon
2015-07-08 17:08                 ` Will Deacon
2015-07-09 12:54                 ` Zhichao Huang
2015-07-09 12:54                   ` Zhichao Huang
2015-07-09 11:50             ` Christoffer Dall
2015-07-09 11:50               ` Christoffer Dall
2015-07-13 12:12               ` zichao
2015-07-13 12:12                 ` zichao
2015-06-22 10:41 ` [PATCH v3 09/11] KVM: arm: implement lazy world switch " Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:15   ` Christoffer Dall
2015-06-30 13:15     ` Christoffer Dall
2015-07-03 10:06     ` Zhichao Huang
2015-07-03 10:06       ` Zhichao Huang
2015-07-03 21:05       ` Christoffer Dall
2015-07-03 21:05         ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 10/11] KVM: arm: add a trace event for cp14 traps Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:20   ` Christoffer Dall
2015-06-30 13:20     ` Christoffer Dall
2015-06-22 10:41 ` [PATCH v3 11/11] KVM: arm: enable trapping of all debug registers Zhichao Huang
2015-06-22 10:41   ` Zhichao Huang
2015-06-30 13:19   ` Christoffer Dall
2015-06-30 13:19     ` Christoffer Dall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.