* [PATCH 0/4] KVM/arm64 fixes for 3.11
@ 2013-07-19 13:53 ` Marc Zyngier
  0 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 13:53 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: christoffer.dall, catalin.marinas, will.deacon

This series contains a small number of patches for KVM/arm64. The
first two are the arm64 equivalent of AArch32 fixes that have already
been merged, and the next two are more related to performance
optimisation and code cleanup.

The patches are also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/fixes-3.11-rc1

Marc Zyngier (4):
  arm64: KVM: perform save/restore of PAR_EL1
  arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
  arm64: KVM: let other tasks run when hitting WFE
  arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S

 arch/arm64/include/asm/kvm_arm.h |  6 ++++--
 arch/arm64/include/asm/kvm_asm.h | 23 ++++++++++++++---------
 arch/arm64/include/asm/virt.h    |  4 ++++
 arch/arm64/kvm/handle_exit.c     | 18 +++++++++++++-----
 arch/arm64/kvm/hyp.S             | 18 ++++++++++++------
 arch/arm64/kvm/sys_regs.c        |  3 +++
 6 files changed, 50 insertions(+), 22 deletions(-)

-- 
1.8.2.3




* [PATCH 1/4] arm64: KVM: perform save/restore of PAR_EL1
  2013-07-19 13:53 ` Marc Zyngier
@ 2013-07-19 13:53   ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 13:53 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: christoffer.dall, catalin.marinas, will.deacon

Not saving PAR_EL1 is an unfortunate oversight. If the guest
performs an AT* operation and gets scheduled out before reading
the result of the translation from PAR_EL1, it could become
corrupted by another guest or the host.

Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h | 17 ++++++++++-------
 arch/arm64/kvm/hyp.S             | 10 ++++++++++
 arch/arm64/kvm/sys_regs.c        |  3 +++
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index c92de41..b25763b 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -42,14 +42,15 @@
 #define	TPIDR_EL1	18	/* Thread ID, Privileged */
 #define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
 #define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
+#define	PAR_EL1		21	/* Physical Address Register */
 /* 32bit specific registers. Keep them at the end of the range */
-#define	DACR32_EL2	21	/* Domain Access Control Register */
-#define	IFSR32_EL2	22	/* Instruction Fault Status Register */
-#define	FPEXC32_EL2	23	/* Floating-Point Exception Control Register */
-#define	DBGVCR32_EL2	24	/* Debug Vector Catch Register */
-#define	TEECR32_EL1	25	/* ThumbEE Configuration Register */
-#define	TEEHBR32_EL1	26	/* ThumbEE Handler Base Register */
-#define	NR_SYS_REGS	27
+#define	DACR32_EL2	22	/* Domain Access Control Register */
+#define	IFSR32_EL2	23	/* Instruction Fault Status Register */
+#define	FPEXC32_EL2	24	/* Floating-Point Exception Control Register */
+#define	DBGVCR32_EL2	25	/* Debug Vector Catch Register */
+#define	TEECR32_EL1	26	/* ThumbEE Configuration Register */
+#define	TEEHBR32_EL1	27	/* ThumbEE Handler Base Register */
+#define	NR_SYS_REGS	28
 
 /* 32bit mapping */
 #define c0_MPIDR	(MPIDR_EL1 * 2)	/* MultiProcessor ID Register */
@@ -69,6 +70,8 @@
 #define c5_AIFSR	(AFSR1_EL1 * 2)	/* Auxiliary Instr Fault Status R */
 #define c6_DFAR		(FAR_EL1 * 2)	/* Data Fault Address Register */
 #define c6_IFAR		(c6_DFAR + 1)	/* Instruction Fault Address Register */
+#define c7_PAR		(PAR_EL1 * 2)	/* Physical Address Register */
+#define c7_PAR_high	(c7_PAR + 1)	/* PAR top 32 bits */
 #define c10_PRRR	(MAIR_EL1 * 2)	/* Primary Region Remap Register */
 #define c10_NMRR	(c10_PRRR + 1)	/* Normal Memory Remap Register */
 #define c12_VBAR	(VBAR_EL1 * 2)	/* Vector Base Address Register */
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index ff985e3..218802f 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -214,6 +214,7 @@ __kvm_hyp_code_start:
 	mrs	x21,	tpidr_el1
 	mrs	x22, 	amair_el1
 	mrs	x23, 	cntkctl_el1
+	mrs	x24,	par_el1
 
 	stp	x4, x5, [x3]
 	stp	x6, x7, [x3, #16]
@@ -225,6 +226,7 @@ __kvm_hyp_code_start:
 	stp	x18, x19, [x3, #112]
 	stp	x20, x21, [x3, #128]
 	stp	x22, x23, [x3, #144]
+	str	x24, [x3, #160]
 .endm
 
 .macro restore_sysregs
@@ -243,6 +245,7 @@ __kvm_hyp_code_start:
 	ldp	x18, x19, [x3, #112]
 	ldp	x20, x21, [x3, #128]
 	ldp	x22, x23, [x3, #144]
+	ldr	x24, [x3, #160]
 
 	msr	vmpidr_el2,	x4
 	msr	csselr_el1,	x5
@@ -264,6 +267,7 @@ __kvm_hyp_code_start:
 	msr	tpidr_el1,	x21
 	msr	amair_el1,	x22
 	msr	cntkctl_el1,	x23
+	msr	par_el1,	x24
 .endm
 
 .macro skip_32bit_state tmp, target
@@ -753,6 +757,10 @@ el1_trap:
 	 */
 	tbnz	x1, #7, 1f	// S1PTW is set
 
+	/* Preserve PAR_EL1 */
+	mrs	x3, par_el1
+	push	x3, xzr
+
 	/*
 	 * Permission fault, HPFAR_EL2 is invalid.
 	 * Resolve the IPA the hard way using the guest VA.
@@ -766,6 +774,8 @@ el1_trap:
 
 	/* Read result */
 	mrs	x3, par_el1
+	pop	x0, xzr			// Restore PAR_EL1 from the stack
+	msr	par_el1, x0
 	tbnz	x3, #0, 3f		// Bail out if we failed the translation
 	ubfx	x3, x3, #12, #36	// Extract IPA
 	lsl	x3, x3, #4		// and present it like HPFAR
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 9492360..02e9d09 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -211,6 +211,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	/* FAR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
 	  NULL, reset_unknown, FAR_EL1 },
+	/* PAR_EL1 */
+	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
+	  NULL, reset_unknown, PAR_EL1 },
 
 	/* PMINTENSET_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001),
-- 
1.8.2.3
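
The scenario from the commit message, sketched from a guest's point of
view (an illustrative fragment, not code from this series; the function
name is invented):

	static unsigned long guest_va_to_par(unsigned long va)
	{
		unsigned long par;

		asm volatile("at s1e1r, %0" : : "r" (va));
		asm volatile("isb" : : : "memory");
		/*
		 * If the vcpu is scheduled out here and PAR_EL1 is not part
		 * of the saved context, the value read back below can belong
		 * to another guest or to the host.
		 */
		asm volatile("mrs %0, par_el1" : "=r" (par));
		return par;
	}

The save/restore added to hyp.S closes that window by treating PAR_EL1
like the other EL1 system registers in the vcpu context.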




* [PATCH 2/4] arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
  2013-07-19 13:53 ` Marc Zyngier
@ 2013-07-19 13:53   ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 13:53 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: christoffer.dall, catalin.marinas, will.deacon

When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.

For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa
before doing the TLB invalidation itself.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 218802f..e1ccfcc 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -604,6 +604,8 @@ END(__kvm_vcpu_run)
 
 // void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 ENTRY(__kvm_tlb_flush_vmid_ipa)
+	dsb	ishst	// Make sure previous writes are observable
+
 	kern_hyp_va	x0
 	ldr	x2, [x0, #KVM_VTTBR]
 	msr	vttbr_el2, x2
-- 
1.8.2.3
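
A host-side sketch of the ordering the new barrier provides (illustrative
only; the function and variable names below are placeholders, and the real
Stage-2 update paths live in the KVM MMU code):

	static void stage2_update_and_flush(struct kvm *kvm, pte_t *ptep,
					    pte_t new_pte, phys_addr_t ipa)
	{
		*ptep = new_pte;	/* Stage-2 page table write */
		/*
		 * The dsb ishst added by this patch (inside
		 * __kvm_tlb_flush_vmid_ipa) makes the write above observable
		 * before the TLB invalidation, so no CPU can refill its TLB
		 * from the stale entry.
		 */
		__kvm_tlb_flush_vmid_ipa(kvm, ipa);
	}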




* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 13:53 ` Marc Zyngier
@ 2013-07-19 13:53   ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 13:53 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: christoffer.dall, catalin.marinas, will.deacon

So far, when a guest executes WFE (like when waiting for a spinlock
to become unlocked), we don't do a thing and let it run uninterrupted.

Another option is to trap a blocking WFE and offer the opportunity
to the scheduler to switch to another task, potentially giving the
vcpu holding the spinlock a chance to run sooner.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h |  6 ++++--
 arch/arm64/kvm/handle_exit.c     | 18 +++++++++++++-----
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index a5f28e2..ac1ea05 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -63,6 +63,7 @@
  * TAC:		Trap ACTLR
  * TSC:		Trap SMC
  * TSW:		Trap cache operations by set/way
+ * TWE:		Trap WFE
  * TWI:		Trap WFI
  * TIDCP:	Trap L2CTLR/L2ECTLR
  * BSU_IS:	Upgrade barriers to the inner shareable domain
@@ -72,8 +73,9 @@
  * FMO:		Override CPSR.F and enable signaling with VF
  * SWIO:	Turn set/way invalidates into set/way clean+invalidate
  */
-#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
-			 HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
+#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
+			 HCR_BSU_IS | HCR_FB | HCR_TAC | \
+			 HCR_AMO | HCR_IMO | HCR_FMO | \
 			 HCR_SWIO | HCR_TIDCP | HCR_RW)
 #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
 
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 9beaca03..b0098c2 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -47,21 +47,29 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 }
 
 /**
- * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
+ * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
+ *		    instruction executed by a guest
+ *
  * @vcpu:	the vcpu pointer
  *
- * Simply call kvm_vcpu_block(), which will halt execution of
+ * WFE: Yield the CPU and come back to this vcpu when the scheduler
+ * decides to.
+ * WFI: Simply call kvm_vcpu_block(), which will halt execution of
  * world-switches and schedule other host processes until there is an
  * incoming IRQ or FIQ to the VM.
  */
-static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
+static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-	kvm_vcpu_block(vcpu);
+	if (kvm_vcpu_get_hsr(vcpu) & 1)
+		cond_resched();
+	else
+		kvm_vcpu_block(vcpu);
+
 	return 1;
 }
 
 static exit_handle_fn arm_exit_handlers[] = {
-	[ESR_EL2_EC_WFI]	= kvm_handle_wfi,
+	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
 	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
 	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
 	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
-- 
1.8.2.3




* [PATCH 4/4] arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S
  2013-07-19 13:53 ` Marc Zyngier
@ 2013-07-19 13:53   ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 13:53 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: christoffer.dall, catalin.marinas, will.deacon

We already have __hyp_text_{start,end} to express the boundaries
of the HYP text section, and __kvm_hyp_code_{start,end} are getting
in the way of more modular world-switch code.

Just turn __kvm_hyp_code_{start,end} into #defines mapping the
linker-emitted symbols.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h | 6 ++++--
 arch/arm64/include/asm/virt.h    | 4 ++++
 arch/arm64/kvm/hyp.S             | 6 ------
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index b25763b..dddb345 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -18,6 +18,8 @@
 #ifndef __ARM_KVM_ASM_H__
 #define __ARM_KVM_ASM_H__
 
+#include <asm/virt.h>
+
 /*
  * 0 is reserved as an invalid value.
  * Order *must* be kept in sync with the hyp switch code.
@@ -95,8 +97,8 @@ extern char __kvm_hyp_init_end[];
 
 extern char __kvm_hyp_vector[];
 
-extern char __kvm_hyp_code_start[];
-extern char __kvm_hyp_code_end[];
+#define	__kvm_hyp_code_start	__hyp_text_start
+#define	__kvm_hyp_code_end	__hyp_text_end
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 4398272..59d67f3 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -49,6 +49,10 @@ static inline bool is_hyp_mode_mismatched(void)
 	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
 }
 
+/* The section containing the hypervisor text */
+extern char __hyp_text_start[];
+extern char __hyp_text_end[];
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index e1ccfcc..19ccf48 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -36,9 +36,6 @@
 	.pushsection	.hyp.text, "ax"
 	.align	PAGE_SHIFT
 
-__kvm_hyp_code_start:
-	.globl __kvm_hyp_code_start
-
 .macro save_common_regs
 	// x2: base address for cpu context
 	// x3: tmp register
@@ -837,7 +834,4 @@ ENTRY(__kvm_hyp_vector)
 	ventry	el1_error_invalid		// Error 32-bit EL1
 ENDPROC(__kvm_hyp_vector)
 
-__kvm_hyp_code_end:
-	.globl	__kvm_hyp_code_end
-
 	.popsection
-- 
1.8.2.3




* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-19 14:25     ` Will Deacon
  -1 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-19 14:25 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, Catalin Marinas

On Fri, Jul 19, 2013 at 02:53:54PM +0100, Marc Zyngier wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
> 
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

[...]

> +static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	kvm_vcpu_block(vcpu);
> +	if (kvm_vcpu_get_hsr(vcpu) & 1)
> +		cond_resched();

The hardcoded `1' doesn't make it obvious that we're treating wfi and wfe
differently. Could you use a #define for that bit in hsr.iss please?

Will
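
A minimal sketch of the kind of definition being asked for (the symbol
name below is illustrative and may not match what was eventually merged;
in the WFI/WFE trap ISS, bit 0 is what distinguishes WFE from WFI):

	/* Illustrative name for the WFI/WFE trap ISS bit 0 ("TI") */
	#define ESR_EL2_EC_WFI_ISS_WFE	(1UL << 0)

	static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
	{
		if (kvm_vcpu_get_hsr(vcpu) & ESR_EL2_EC_WFI_ISS_WFE)
			cond_resched();		/* WFE: let other tasks run */
		else
			kvm_vcpu_block(vcpu);	/* WFI: block until IRQ/FIQ */

		return 1;
	}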


* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 14:25     ` Will Deacon
@ 2013-07-19 14:29       ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 14:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, Catalin Marinas

On 19/07/13 15:25, Will Deacon wrote:
> On Fri, Jul 19, 2013 at 02:53:54PM +0100, Marc Zyngier wrote:
>> So far, when a guest executes WFE (like when waiting for a spinlock
>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>
>> Another option is to trap a blocking WFE and offer the opportunity
>> to the scheduler to switch to another task, potentially giving the
>> vcpu holding the spinlock a chance to run sooner.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> [...]
> 
>> +static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  {
>> -	kvm_vcpu_block(vcpu);
>> +	if (kvm_vcpu_get_hsr(vcpu) & 1)
>> +		cond_resched();
> 
> The hardcoded `1' doesn't make it obvious that we're treating wfi and wfe
> differently. Could you use a #define for that bit in hsr.iss please?

Good point. I'll rework that part.

	M.
-- 
Jazz is not dead. It just smells funny...



* Re: [PATCH 2/4] arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-19 14:32     ` Will Deacon
  -1 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-19 14:32 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, Catalin Marinas

On Fri, Jul 19, 2013 at 02:53:53PM +0100, Marc Zyngier wrote:
> When performing a Stage-2 TLB invalidation, it is necessary to
> make sure the write to the page tables is observable by all CPUs.
> 
> For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa
> before doing the TLB invalidation itself.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp.S | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 218802f..e1ccfcc 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -604,6 +604,8 @@ END(__kvm_vcpu_run)
>  
>  // void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>  ENTRY(__kvm_tlb_flush_vmid_ipa)
> +	dsb	ishst	// Make sure previous writes are observable
> +
>  	kern_hyp_va	x0
>  	ldr	x2, [x0, #KVM_VTTBR]
>  	msr	vttbr_el2, x2
> -- 
> 1.8.2.3

I don't think the comment adds anything to this code. Also, why don't you
need similar barriers for the cases where you clobber the entire TLB (e.g.
__kvm_flush_vm_context)?

Will


* Re: [PATCH 2/4] arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
  2013-07-19 14:32     ` Will Deacon
@ 2013-07-19 14:53       ` Marc Zyngier
  -1 siblings, 0 replies; 42+ messages in thread
From: Marc Zyngier @ 2013-07-19 14:53 UTC (permalink / raw)
  To: Will Deacon
  Cc: christoffer.dall, Catalin Marinas, kvm, linux-arm-kernel, kvmarm

On 19/07/13 15:32, Will Deacon wrote:
> On Fri, Jul 19, 2013 at 02:53:53PM +0100, Marc Zyngier wrote:
>> When performing a Stage-2 TLB invalidation, it is necessary to
>> make sure the write to the page tables is observable by all CPUs.
>>
>> For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa
>> before doing the TLB invalidation itself.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kvm/hyp.S | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index 218802f..e1ccfcc 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -604,6 +604,8 @@ END(__kvm_vcpu_run)
>>  
>>  // void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  ENTRY(__kvm_tlb_flush_vmid_ipa)
>> +	dsb	ishst	// Make sure previous writes are observable
>> +
>>  	kern_hyp_va	x0
>>  	ldr	x2, [x0, #KVM_VTTBR]
>>  	msr	vttbr_el2, x2
>> -- 
>> 1.8.2.3
> 
> I don't think the comment adds anything to this code. Also, why don't you
> need similar barriers for the cases where you clobber the entire TLB (e.g.
> __kvm_flush_vm_context)?

I think they are required as well. I'll respin the patch.

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH 1/4] arm64: KVM: perform save/restore of PAR_EL1
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-20 21:51     ` Christoffer Dall
  -1 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-20 21:51 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, kvm, kvmarm, catalin.marinas, will.deacon

On Fri, Jul 19, 2013 at 02:53:52PM +0100, Marc Zyngier wrote:
> Not saving PAR_EL1 is an unfortunate oversight. If the guest
> performs an AT* operation and gets scheduled out before reading
> the result of the translation from PAR_EL1, it could become
> corrupted by another guest or the host.
> 
> Saving this register is made slightly more complicated as KVM also
> uses it on the permission fault handling path, leading to an ugly
> "stash and restore" sequence. Fortunately, this is already a slow
> path so we don't really care. Also, Linux doesn't do any AT*
> operation, so Linux guests are not impacted by this bug.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h | 17 ++++++++++-------
>  arch/arm64/kvm/hyp.S             | 10 ++++++++++
>  arch/arm64/kvm/sys_regs.c        |  3 +++
>  3 files changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index c92de41..b25763b 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -42,14 +42,15 @@
>  #define	TPIDR_EL1	18	/* Thread ID, Privileged */
>  #define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
>  #define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
> +#define	PAR_EL1		21	/* Physical Address Register */
>  /* 32bit specific registers. Keep them at the end of the range */
> -#define	DACR32_EL2	21	/* Domain Access Control Register */
> -#define	IFSR32_EL2	22	/* Instruction Fault Status Register */
> -#define	FPEXC32_EL2	23	/* Floating-Point Exception Control Register */
> -#define	DBGVCR32_EL2	24	/* Debug Vector Catch Register */
> -#define	TEECR32_EL1	25	/* ThumbEE Configuration Register */
> -#define	TEEHBR32_EL1	26	/* ThumbEE Handler Base Register */
> -#define	NR_SYS_REGS	27
> +#define	DACR32_EL2	22	/* Domain Access Control Register */
> +#define	IFSR32_EL2	23	/* Instruction Fault Status Register */
> +#define	FPEXC32_EL2	24	/* Floating-Point Exception Control Register */
> +#define	DBGVCR32_EL2	25	/* Debug Vector Catch Register */
> +#define	TEECR32_EL1	26	/* ThumbEE Configuration Register */
> +#define	TEEHBR32_EL1	27	/* ThumbEE Handler Base Register */
> +#define	NR_SYS_REGS	28
>  
>  /* 32bit mapping */
>  #define c0_MPIDR	(MPIDR_EL1 * 2)	/* MultiProcessor ID Register */
> @@ -69,6 +70,8 @@
>  #define c5_AIFSR	(AFSR1_EL1 * 2)	/* Auxiliary Instr Fault Status R */
>  #define c6_DFAR		(FAR_EL1 * 2)	/* Data Fault Address Register */
>  #define c6_IFAR		(c6_DFAR + 1)	/* Instruction Fault Address Register */
> +#define c7_PAR		(PAR_EL1 * 2)	/* Physical Address Register */
> +#define c7_PAR_high	(c7_PAR + 1)	/* PAR top 32 bits */
>  #define c10_PRRR	(MAIR_EL1 * 2)	/* Primary Region Remap Register */
>  #define c10_NMRR	(c10_PRRR + 1)	/* Normal Memory Remap Register */
>  #define c12_VBAR	(VBAR_EL1 * 2)	/* Vector Base Address Register */
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index ff985e3..218802f 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -214,6 +214,7 @@ __kvm_hyp_code_start:
>  	mrs	x21,	tpidr_el1
>  	mrs	x22, 	amair_el1
>  	mrs	x23, 	cntkctl_el1
> +	mrs	x24,	par_el1
>  
>  	stp	x4, x5, [x3]
>  	stp	x6, x7, [x3, #16]
> @@ -225,6 +226,7 @@ __kvm_hyp_code_start:
>  	stp	x18, x19, [x3, #112]
>  	stp	x20, x21, [x3, #128]
>  	stp	x22, x23, [x3, #144]
> +	str	x24, [x3, #160]
>  .endm
>  
>  .macro restore_sysregs
> @@ -243,6 +245,7 @@ __kvm_hyp_code_start:
>  	ldp	x18, x19, [x3, #112]
>  	ldp	x20, x21, [x3, #128]
>  	ldp	x22, x23, [x3, #144]
> +	ldr	x24, [x3, #160]
>  
>  	msr	vmpidr_el2,	x4
>  	msr	csselr_el1,	x5
> @@ -264,6 +267,7 @@ __kvm_hyp_code_start:
>  	msr	tpidr_el1,	x21
>  	msr	amair_el1,	x22
>  	msr	cntkctl_el1,	x23
> +	msr	par_el1,	x24
>  .endm
>  
>  .macro skip_32bit_state tmp, target
> @@ -753,6 +757,10 @@ el1_trap:
>  	 */
>  	tbnz	x1, #7, 1f	// S1PTW is set
>  
> +	/* Preserve PAR_EL1 */
> +	mrs	x3, par_el1
> +	push	x3, xzr
> +
>  	/*
>  	 * Permission fault, HPFAR_EL2 is invalid.
>  	 * Resolve the IPA the hard way using the guest VA.
> @@ -766,6 +774,8 @@ el1_trap:
>  
>  	/* Read result */
>  	mrs	x3, par_el1
> +	pop	x0, xzr			// Restore PAR_EL1 from the stack
> +	msr	par_el1, x0
>  	tbnz	x3, #0, 3f		// Bail out if we failed the translation
>  	ubfx	x3, x3, #12, #36	// Extract IPA
>  	lsl	x3, x3, #4		// and present it like HPFAR
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 9492360..02e9d09 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -211,6 +211,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	/* FAR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
>  	  NULL, reset_unknown, FAR_EL1 },
> +	/* PAR_EL1 */
> +	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
> +	  NULL, reset_unknown, PAR_EL1 },
>  
>  	/* PMINTENSET_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001),
> -- 
> 1.8.2.3
> 
> 

Looks good to me,
-Christoffer


* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-20 22:04     ` Christoffer Dall
  -1 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-20 22:04 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, kvm, kvmarm, catalin.marinas, will.deacon

On Fri, Jul 19, 2013 at 02:53:54PM +0100, Marc Zyngier wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
> 
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
> 

I'm curious whether we have any data showing this to be a good idea?

My intuition here is that waiting for a spinlock really shouldn't be
something a guest is doing for a long time - we always try to avoid too
much contention on spinlocks, no?  The theory that it would unlock the
spinlock sooner is really only supported if the CPU resources are
grossly oversubscribed - are we optimizing for this case?

So, how many cycles do we anticipate a world-switch back and forward
between a VM and the host to be compared to the average number of spin
cycles for a spinlock?

Finally, for the case where a waiting vcpu is only going to spin for a
couple of cycles, aren't we adding significant overhead?  I would expect
this to be the most common case.


> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_arm.h |  6 ++++--
>  arch/arm64/kvm/handle_exit.c     | 18 +++++++++++++-----
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index a5f28e2..ac1ea05 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -63,6 +63,7 @@
>   * TAC:		Trap ACTLR
>   * TSC:		Trap SMC
>   * TSW:		Trap cache operations by set/way
> + * TWE:		Trap WFE
>   * TWI:		Trap WFI
>   * TIDCP:	Trap L2CTLR/L2ECTLR
>   * BSU_IS:	Upgrade barriers to the inner shareable domain
> @@ -72,8 +73,9 @@
>   * FMO:		Override CPSR.F and enable signaling with VF
>   * SWIO:	Turn set/way invalidates into set/way clean+invalidate
>   */
> -#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
> -			 HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
> +#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
> +			 HCR_BSU_IS | HCR_FB | HCR_TAC | \
> +			 HCR_AMO | HCR_IMO | HCR_FMO | \
>  			 HCR_SWIO | HCR_TIDCP | HCR_RW)
>  #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
>  
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 9beaca03..b0098c2 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -47,21 +47,29 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  }
>  
>  /**
> - * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
> + * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
> + *		    instruction executed by a guest
> + *
>   * @vcpu:	the vcpu pointer
>   *
> - * Simply call kvm_vcpu_block(), which will halt execution of
> + * WFE: Yield the CPU and come back to this vcpu when the scheduler
> + * decides to.
> + * WFI: Simply call kvm_vcpu_block(), which will halt execution of
>   * world-switches and schedule other host processes until there is an
>   * incoming IRQ or FIQ to the VM.
>   */
> -static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	kvm_vcpu_block(vcpu);
> +	if (kvm_vcpu_get_hsr(vcpu) & 1)
> +		cond_resched();
> +	else
> +		kvm_vcpu_block(vcpu);
> +
>  	return 1;
>  }
>  
>  static exit_handle_fn arm_exit_handlers[] = {
> -	[ESR_EL2_EC_WFI]	= kvm_handle_wfi,
> +	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
>  	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
>  	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
>  	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
> -- 
> 1.8.2.3
> 
> 

-- 
Christoffer


* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-22  7:36     ` Gleb Natapov
  -1 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2013-07-22  7:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, catalin.marinas,
	will.deacon

On Fri, Jul 19, 2013 at 02:53:54PM +0100, Marc Zyngier wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
> 
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
> 
Why is this targeted at 3.11 as opposed to 3.12? Looks like an
optimization.

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_arm.h |  6 ++++--
>  arch/arm64/kvm/handle_exit.c     | 18 +++++++++++++-----
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index a5f28e2..ac1ea05 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -63,6 +63,7 @@
>   * TAC:		Trap ACTLR
>   * TSC:		Trap SMC
>   * TSW:		Trap cache operations by set/way
> + * TWE:		Trap WFE
>   * TWI:		Trap WFI
>   * TIDCP:	Trap L2CTLR/L2ECTLR
>   * BSU_IS:	Upgrade barriers to the inner shareable domain
> @@ -72,8 +73,9 @@
>   * FMO:		Override CPSR.F and enable signaling with VF
>   * SWIO:	Turn set/way invalidates into set/way clean+invalidate
>   */
> -#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
> -			 HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
> +#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
> +			 HCR_BSU_IS | HCR_FB | HCR_TAC | \
> +			 HCR_AMO | HCR_IMO | HCR_FMO | \
>  			 HCR_SWIO | HCR_TIDCP | HCR_RW)
>  #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
>  
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 9beaca03..b0098c2 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -47,21 +47,29 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  }
>  
>  /**
> - * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
> + * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
> + *		    instruction executed by a guest
> + *
>   * @vcpu:	the vcpu pointer
>   *
> - * Simply call kvm_vcpu_block(), which will halt execution of
> + * WFE: Yield the CPU and come back to this vcpu when the scheduler
> + * decides to.
> + * WFI: Simply call kvm_vcpu_block(), which will halt execution of
>   * world-switches and schedule other host processes until there is an
>   * incoming IRQ or FIQ to the VM.
>   */
> -static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	kvm_vcpu_block(vcpu);
> +	if (kvm_vcpu_get_hsr(vcpu) & 1)
> +		cond_resched();
> +	else
> +		kvm_vcpu_block(vcpu);
> +
>  	return 1;
>  }
>  
>  static exit_handle_fn arm_exit_handlers[] = {
> -	[ESR_EL2_EC_WFI]	= kvm_handle_wfi,
> +	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
>  	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
>  	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
>  	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
> -- 
> 1.8.2.3
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-22  7:36     ` Gleb Natapov
  0 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2013-07-22  7:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 19, 2013 at 02:53:54PM +0100, Marc Zyngier wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
> 
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
> 
Why is this targeted at 3.11 as opposed to 3.12? Looks like an
optimization.

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_arm.h |  6 ++++--
>  arch/arm64/kvm/handle_exit.c     | 18 +++++++++++++-----
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index a5f28e2..ac1ea05 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -63,6 +63,7 @@
>   * TAC:		Trap ACTLR
>   * TSC:		Trap SMC
>   * TSW:		Trap cache operations by set/way
> + * TWE:		Trap WFE
>   * TWI:		Trap WFI
>   * TIDCP:	Trap L2CTLR/L2ECTLR
>   * BSU_IS:	Upgrade barriers to the inner shareable domain
> @@ -72,8 +73,9 @@
>   * FMO:		Override CPSR.F and enable signaling with VF
>   * SWIO:	Turn set/way invalidates into set/way clean+invalidate
>   */
> -#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
> -			 HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
> +#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
> +			 HCR_BSU_IS | HCR_FB | HCR_TAC | \
> +			 HCR_AMO | HCR_IMO | HCR_FMO | \
>  			 HCR_SWIO | HCR_TIDCP | HCR_RW)
>  #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
>  
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 9beaca03..b0098c2 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -47,21 +47,29 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  }
>  
>  /**
> - * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
> + * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
> + *		    instruction executed by a guest
> + *
>   * @vcpu:	the vcpu pointer
>   *
> - * Simply call kvm_vcpu_block(), which will halt execution of
> + * WFE: Yield the CPU and come back to this vcpu when the scheduler
> + * decides to.
> + * WFI: Simply call kvm_vcpu_block(), which will halt execution of
>   * world-switches and schedule other host processes until there is an
>   * incoming IRQ or FIQ to the VM.
>   */
> -static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> -	kvm_vcpu_block(vcpu);
> +	if (kvm_vcpu_get_hsr(vcpu) & 1)
> +		cond_resched();
> +	else
> +		kvm_vcpu_block(vcpu);
> +
>  	return 1;
>  }
>  
>  static exit_handle_fn arm_exit_handlers[] = {
> -	[ESR_EL2_EC_WFI]	= kvm_handle_wfi,
> +	[ESR_EL2_EC_WFI]	= kvm_handle_wfx,
>  	[ESR_EL2_EC_CP15_32]	= kvm_handle_cp15_32,
>  	[ESR_EL2_EC_CP15_64]	= kvm_handle_cp15_64,
>  	[ESR_EL2_EC_CP14_MR]	= kvm_handle_cp14_access,
> -- 
> 1.8.2.3
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 4/4] arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-22  7:36     ` Gleb Natapov
  -1 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2013-07-22  7:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, catalin.marinas,
	will.deacon

On Fri, Jul 19, 2013 at 02:53:55PM +0100, Marc Zyngier wrote:
> We already have __hyp_text_{start,end} to express the boundaries
> of the HYP text section, and __kvm_hyp_code_{start,end} are getting
> in the way of a more modular world switch code.
> 
> Just turn __kvm_hyp_code_{start,end} into #defines mapping the
> linker-emitted symbols.
> 
Same here, why 3.11?

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h | 6 ++++--
>  arch/arm64/include/asm/virt.h    | 4 ++++
>  arch/arm64/kvm/hyp.S             | 6 ------
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index b25763b..dddb345 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -18,6 +18,8 @@
>  #ifndef __ARM_KVM_ASM_H__
>  #define __ARM_KVM_ASM_H__
>  
> +#include <asm/virt.h>
> +
>  /*
>   * 0 is reserved as an invalid value.
>   * Order *must* be kept in sync with the hyp switch code.
> @@ -95,8 +97,8 @@ extern char __kvm_hyp_init_end[];
>  
>  extern char __kvm_hyp_vector[];
>  
> -extern char __kvm_hyp_code_start[];
> -extern char __kvm_hyp_code_end[];
> +#define	__kvm_hyp_code_start	__hyp_text_start
> +#define	__kvm_hyp_code_end	__hyp_text_end
>  
>  extern void __kvm_flush_vm_context(void);
>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 4398272..59d67f3 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -49,6 +49,10 @@ static inline bool is_hyp_mode_mismatched(void)
>  	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>  }
>  
> +/* The section containing the hypervisor text */
> +extern char __hyp_text_start[];
> +extern char __hyp_text_end[];
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* ! __ASM__VIRT_H */
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index e1ccfcc..19ccf48 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -36,9 +36,6 @@
>  	.pushsection	.hyp.text, "ax"
>  	.align	PAGE_SHIFT
>  
> -__kvm_hyp_code_start:
> -	.globl __kvm_hyp_code_start
> -
>  .macro save_common_regs
>  	// x2: base address for cpu context
>  	// x3: tmp register
> @@ -837,7 +834,4 @@ ENTRY(__kvm_hyp_vector)
>  	ventry	el1_error_invalid		// Error 32-bit EL1
>  ENDPROC(__kvm_hyp_vector)
>  
> -__kvm_hyp_code_end:
> -	.globl	__kvm_hyp_code_end
> -
>  	.popsection
> -- 
> 1.8.2.3
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 4/4] arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S
@ 2013-07-22  7:36     ` Gleb Natapov
  0 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2013-07-22  7:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 19, 2013 at 02:53:55PM +0100, Marc Zyngier wrote:
> We already have __hyp_text_{start,end} to express the boundaries
> of the HYP text section, and __kvm_hyp_code_{start,end} are getting
> in the way of a more modular world switch code.
> 
> Just turn __kvm_hyp_code_{start,end} into #defines mapping the
> linker-emitted symbols.
> 
Same here, why 3.11?

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h | 6 ++++--
>  arch/arm64/include/asm/virt.h    | 4 ++++
>  arch/arm64/kvm/hyp.S             | 6 ------
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index b25763b..dddb345 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -18,6 +18,8 @@
>  #ifndef __ARM_KVM_ASM_H__
>  #define __ARM_KVM_ASM_H__
>  
> +#include <asm/virt.h>
> +
>  /*
>   * 0 is reserved as an invalid value.
>   * Order *must* be kept in sync with the hyp switch code.
> @@ -95,8 +97,8 @@ extern char __kvm_hyp_init_end[];
>  
>  extern char __kvm_hyp_vector[];
>  
> -extern char __kvm_hyp_code_start[];
> -extern char __kvm_hyp_code_end[];
> +#define	__kvm_hyp_code_start	__hyp_text_start
> +#define	__kvm_hyp_code_end	__hyp_text_end
>  
>  extern void __kvm_flush_vm_context(void);
>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 4398272..59d67f3 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -49,6 +49,10 @@ static inline bool is_hyp_mode_mismatched(void)
>  	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>  }
>  
> +/* The section containing the hypervisor text */
> +extern char __hyp_text_start[];
> +extern char __hyp_text_end[];
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* ! __ASM__VIRT_H */
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index e1ccfcc..19ccf48 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -36,9 +36,6 @@
>  	.pushsection	.hyp.text, "ax"
>  	.align	PAGE_SHIFT
>  
> -__kvm_hyp_code_start:
> -	.globl __kvm_hyp_code_start
> -
>  .macro save_common_regs
>  	// x2: base address for cpu context
>  	// x3: tmp register
> @@ -837,7 +834,4 @@ ENTRY(__kvm_hyp_vector)
>  	ventry	el1_error_invalid		// Error 32-bit EL1
>  ENDPROC(__kvm_hyp_vector)
>  
> -__kvm_hyp_code_end:
> -	.globl	__kvm_hyp_code_end
> -
>  	.popsection
> -- 
> 1.8.2.3
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-19 13:53   ` Marc Zyngier
@ 2013-07-22  8:53     ` Raghavendra KT
  -1 siblings, 0 replies; 42+ messages in thread
From: Raghavendra KT @ 2013-07-22  8:53 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, christoffer.dall, catalin.marinas,
	will.deacon, Raghavendra KT

On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
>
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
>

The idea looks correct from my experiments on x86. It does bring some
percentage of benefit in overcommitted guests. In fact,

https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86
(this results in using the PLE handler heuristics in the vcpu_block path).

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-22  8:53     ` Raghavendra KT
  0 siblings, 0 replies; 42+ messages in thread
From: Raghavendra KT @ 2013-07-22  8:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> So far, when a guest executes WFE (like when waiting for a spinlock
> to become unlocked), we don't do a thing and let it run uninterrupted.
>
> Another option is to trap a blocking WFE and offer the opportunity
> to the scheduler to switch to another task, potentially giving the
> vcpu holding the spinlock a chance to run sooner.
>

The idea looks correct from my experiments on x86. It does bring some
percentage of benefit in overcommitted guests. In fact,

https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86
(this results in using the PLE handler heuristics in the vcpu_block path).

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-22  8:53     ` Raghavendra KT
@ 2013-07-22 12:51       ` Christoffer Dall
  -1 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-22 12:51 UTC (permalink / raw)
  To: Raghavendra KT
  Cc: Marc Zyngier, linux-arm-kernel, kvm, kvmarm, Catalin Marinas,
	Will Deacon, Raghavendra KT

On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> So far, when a guest executes WFE (like when waiting for a spinlock
>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>
>> Another option is to trap a blocking WFE and offer the opportunity
>> to the scheduler to switch to another task, potentially giving the
>> vcpu holding the spinlock a chance to run sooner.
>>
>
> Idea looks to be correct from my experiments on x86. It does bring some
> percentage of benefits in overcommitted guests. Infact,
>
> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> (this results in using ple handler heuristics in vcpu_block pach).

What about the adverse effect in the non-overcommitted case?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-22 12:51       ` Christoffer Dall
  0 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-22 12:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> So far, when a guest executes WFE (like when waiting for a spinlock
>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>
>> Another option is to trap a blocking WFE and offer the opportunity
>> to the scheduler to switch to another task, potentially giving the
>> vcpu holding the spinlock a chance to run sooner.
>>
>
> Idea looks to be correct from my experiments on x86. It does bring some
> percentage of benefits in overcommitted guests. Infact,
>
> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> (this results in using ple handler heuristics in vcpu_block pach).

What about the adverse effect in the non-overcommitted case?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-22 12:51       ` Christoffer Dall
@ 2013-07-22 13:01         ` Will Deacon
  -1 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-22 13:01 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, Marc Zyngier, Raghavendra KT, Raghavendra KT,
	Catalin Marinas, kvmarm, linux-arm-kernel

On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> So far, when a guest executes WFE (like when waiting for a spinlock
> >> to become unlocked), we don't do a thing and let it run uninterrupted.
> >>
> >> Another option is to trap a blocking WFE and offer the opportunity
> >> to the scheduler to switch to another task, potentially giving the
> >> vcpu holding the spinlock a chance to run sooner.
> >>
> >
> > Idea looks to be correct from my experiments on x86. It does bring some
> > percentage of benefits in overcommitted guests. Infact,
> >
> > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > (this results in using ple handler heuristics in vcpu_block pach).
> 
> What about the adverse effect in the non-overcommitted case?

You might hope to reduce overall system latency, in the same way that
PREEMPT kernels implement spin_lock in terms of spin_trylock. Marc might
have some ideas about the practical performance of this, but he's currently
on holiday and (assumedly) not reading the list!
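
(For illustration only - not the actual kernel implementation, which
also has to deal with preempt counts and lockdep - the PREEMPT idea
boils down to something like:

static void toy_spin_lock(spinlock_t *lock)
{
	while (!spin_trylock(lock)) {
		/*
		 * Preemption stays enabled between attempts, so the
		 * waiter can be scheduled away instead of burning the
		 * CPU while the lock is held.
		 */
		while (spin_is_locked(lock))
			cpu_relax();
	}
}

i.e. the waiter never spins with preemption disabled, which is roughly
the behaviour that trapping WFE tries to give a spinning vcpu.)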

Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-22 13:01         ` Will Deacon
  0 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-22 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> So far, when a guest executes WFE (like when waiting for a spinlock
> >> to become unlocked), we don't do a thing and let it run uninterrupted.
> >>
> >> Another option is to trap a blocking WFE and offer the opportunity
> >> to the scheduler to switch to another task, potentially giving the
> >> vcpu holding the spinlock a chance to run sooner.
> >>
> >
> > Idea looks to be correct from my experiments on x86. It does bring some
> > percentage of benefits in overcommitted guests. Infact,
> >
> > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > (this results in using ple handler heuristics in vcpu_block pach).
> 
> What about the adverse effect in the non-overcommitted case?

You might hope to reduce overall system latency, in the same way that
PREEMPT kernels implement spin_lock in terms of spin_trylock. Marc might
have some ideas about the practical performance of this, but he's currently
on holiday and (assumedly) not reading the list!

Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-22 12:51       ` Christoffer Dall
@ 2013-07-22 13:57         ` Raghavendra K T
  -1 siblings, 0 replies; 42+ messages in thread
From: Raghavendra K T @ 2013-07-22 13:57 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Raghavendra KT, Marc Zyngier, linux-arm-kernel, kvm, kvmarm,
	Catalin Marinas, Will Deacon

On 07/22/2013 06:21 PM, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
>> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> So far, when a guest executes WFE (like when waiting for a spinlock
>>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>>
>>> Another option is to trap a blocking WFE and offer the opportunity
>>> to the scheduler to switch to another task, potentially giving the
>>> vcpu holding the spinlock a chance to run sooner.
>>>
>>
>> Idea looks to be correct from my experiments on x86. It does bring some
>> percentage of benefits in overcommitted guests. Infact,
>>
>> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
>> (this results in using ple handler heuristics in vcpu_block pach).
>
> What about the adverse effect in the non-overcommitted case?
>

Ideally it should fail to schedule any other task and come back to the
halt loop. This should not hurt AFAICS. But I agree that numbers are
needed to support this argument.

For x86, I had seen no side effects with the experiments.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-22 13:57         ` Raghavendra K T
  0 siblings, 0 replies; 42+ messages in thread
From: Raghavendra K T @ 2013-07-22 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/22/2013 06:21 PM, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
>> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> So far, when a guest executes WFE (like when waiting for a spinlock
>>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>>
>>> Another option is to trap a blocking WFE and offer the opportunity
>>> to the scheduler to switch to another task, potentially giving the
>>> vcpu holding the spinlock a chance to run sooner.
>>>
>>
>> Idea looks to be correct from my experiments on x86. It does bring some
>> percentage of benefits in overcommitted guests. Infact,
>>
>> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
>> (this results in using ple handler heuristics in vcpu_block pach).
>
> What about the adverse effect in the non-overcommitted case?
>

Ideally it should fail to schedule any other task and come back to the
halt loop. This should not hurt AFAICS. But I agree that numbers are
needed to support this argument.

For x86, I had seen no side effects with the experiments.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-22 12:51       ` Christoffer Dall
@ 2013-07-23 10:41         ` Catalin Marinas
  -1 siblings, 0 replies; 42+ messages in thread
From: Catalin Marinas @ 2013-07-23 10:41 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Raghavendra KT, Marc Zyngier, linux-arm-kernel, kvm, kvmarm,
	Will Deacon, Raghavendra KT

On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> So far, when a guest executes WFE (like when waiting for a spinlock
> >> to become unlocked), we don't do a thing and let it run uninterrupted.
> >>
> >> Another option is to trap a blocking WFE and offer the opportunity
> >> to the scheduler to switch to another task, potentially giving the
> >> vcpu holding the spinlock a chance to run sooner.
> >>
> >
> > Idea looks to be correct from my experiments on x86. It does bring some
> > percentage of benefits in overcommitted guests. Infact,
> >
> > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > (this results in using ple handler heuristics in vcpu_block pach).
> 
> What about the adverse effect in the non-overcommitted case?

Could we not detect overcommitment and only set the TWE bit in this case
(per VCPU scheduled to run)?
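
(A minimal sketch of that, assuming a per-vcpu copy of HCR_EL2 - called
vcpu->arch.hcr_el2 here - and a hypothetical overcommit test, neither
of which exists in this form today:

/* hypothetical hook, run e.g. from kvm_arch_vcpu_load() */
static void update_wfe_trapping(struct kvm_vcpu *vcpu)
{
	/*
	 * Only trap WFE when this physical CPU has other runnable
	 * tasks, i.e. when yielding can actually let someone else run.
	 */
	if (host_cpu_is_overcommitted())	/* hypothetical helper */
		vcpu->arch.hcr_el2 |= HCR_TWE;
	else
		vcpu->arch.hcr_el2 &= ~HCR_TWE;
}

The detection itself is of course the hard part.)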

The advantage of a properly para-virtualised lock is that you can target
which VCPU to wake up. I don't think we can do this currently (without
assumptions about the underlying OS and how the compiler allocates
registers in the ticket spinlocks).

There have been suggestions to use WFE in user-space for a more
power-efficient busy loop (usually in user-only locking; I think
PostgreSQL does that) together with an automatic event stream for
waking up the WFE (Sudeep recently posted patches enabling a 100us
event stream). If such a feature gets used, WFE takes on a new meaning
and we may not want to trap it all the time.

Question for Will - do we have a PMU counter for WFE? (don't ask why ;))

-- 
Catalin

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-23 10:41         ` Catalin Marinas
  0 siblings, 0 replies; 42+ messages in thread
From: Catalin Marinas @ 2013-07-23 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> So far, when a guest executes WFE (like when waiting for a spinlock
> >> to become unlocked), we don't do a thing and let it run uninterrupted.
> >>
> >> Another option is to trap a blocking WFE and offer the opportunity
> >> to the scheduler to switch to another task, potentially giving the
> >> vcpu holding the spinlock a chance to run sooner.
> >>
> >
> > Idea looks to be correct from my experiments on x86. It does bring some
> > percentage of benefits in overcommitted guests. Infact,
> >
> > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > (this results in using ple handler heuristics in vcpu_block pach).
> 
> What about the adverse effect in the non-overcommitted case?

Could we not detect overcommitment and only set the TWE bit in this case
(per VCPU scheduled to run)?

The advantage of a properly para-virtualised lock is that you can target
which VCPU to wake up. I don't think we can do this currently (without
assumptions about the underlying OS and how the compiler allocates
registers in the ticket spinlocks).

There have been suggestions to use WFE in user-space for a more
power-efficient busy loop (usually in user-only locking; I think
PostgreSQL does that) together with an automatic event stream for
waking up the WFE (Sudeep recently posted patches enabling a 100us
event stream). If such a feature gets used, WFE takes on a new meaning
and we may not want to trap it all the time.

Question for Will - do we have a PMU counter for WFE? (don't ask why ;))

-- 
Catalin

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-23 10:41         ` Catalin Marinas
@ 2013-07-23 16:04           ` Will Deacon
  -1 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-23 16:04 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: kvm, Marc Zyngier, Raghavendra KT, Raghavendra KT,
	Christoffer Dall, kvmarm, linux-arm-kernel

On Tue, Jul 23, 2013 at 11:41:14AM +0100, Catalin Marinas wrote:
> On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> > On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >> So far, when a guest executes WFE (like when waiting for a spinlock
> > >> to become unlocked), we don't do a thing and let it run uninterrupted.
> > >>
> > >> Another option is to trap a blocking WFE and offer the opportunity
> > >> to the scheduler to switch to another task, potentially giving the
> > >> vcpu holding the spinlock a chance to run sooner.
> > >>
> > >
> > > Idea looks to be correct from my experiments on x86. It does bring some
> > > percentage of benefits in overcommitted guests. Infact,
> > >
> > > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > > (this results in using ple handler heuristics in vcpu_block pach).
> > 
> > What about the adverse effect in the non-overcommitted case?
> 
> Could we not detect overcommitment and only set the TWE bit in this case
> (per VCPU scheduled to run)?
> 
> The advantage of a properly para-virtualised lock is that you can target
> which VCPU to wake up. I don't think we can do this currently (without
> assumptions about the underlying OS and how the compiler allocates
> registers in the ticket spinlocks).
> 
> There have been suggestions to use WFE in user-space for a more power
> efficient busy loop (usually in user-only locking, I think some
> PostreSQL does that) together with an automatic even stream for waking
> up the WFE (Sudeep posted some patches recently for enabling 100us event
> stream). If such feature gets used, we have a new meaning for WFE and we
> may not want to trap it all the time.
> 
> Question for Will - do we have a PMU counter for WFE? (don't ask why ;))

Step away from the CPU! There's nothing architected for counting wfe
instructions so, although a CPU might expose such an event, software can't
rely on it.

Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-23 16:04           ` Will Deacon
  0 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2013-07-23 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 23, 2013 at 11:41:14AM +0100, Catalin Marinas wrote:
> On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> > On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> > > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >> So far, when a guest executes WFE (like when waiting for a spinlock
> > >> to become unlocked), we don't do a thing and let it run uninterrupted.
> > >>
> > >> Another option is to trap a blocking WFE and offer the opportunity
> > >> to the scheduler to switch to another task, potentially giving the
> > >> vcpu holding the spinlock a chance to run sooner.
> > >>
> > >
> > > Idea looks to be correct from my experiments on x86. It does bring some
> > > percentage of benefits in overcommitted guests. Infact,
> > >
> > > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > > (this results in using ple handler heuristics in vcpu_block pach).
> > 
> > What about the adverse effect in the non-overcommitted case?
> 
> Could we not detect overcommitment and only set the TWE bit in this case
> (per VCPU scheduled to run)?
> 
> The advantage of a properly para-virtualised lock is that you can target
> which VCPU to wake up. I don't think we can do this currently (without
> assumptions about the underlying OS and how the compiler allocates
> registers in the ticket spinlocks).
> 
> There have been suggestions to use WFE in user-space for a more power
> efficient busy loop (usually in user-only locking, I think some
> PostreSQL does that) together with an automatic even stream for waking
> up the WFE (Sudeep posted some patches recently for enabling 100us event
> stream). If such feature gets used, we have a new meaning for WFE and we
> may not want to trap it all the time.
> 
> Question for Will - do we have a PMU counter for WFE? (don't ask why ;))

Step away from the CPU! There's nothing architected for counting wfe
instructions so, although a CPU might expose such an event, software can't
rely on it.

Will

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-22 13:57         ` Raghavendra K T
@ 2013-07-28 20:55           ` Christoffer Dall
  -1 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-28 20:55 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Raghavendra KT, Marc Zyngier, linux-arm-kernel, kvm, kvmarm,
	Catalin Marinas, Will Deacon

On Mon, Jul 22, 2013 at 07:27:58PM +0530, Raghavendra K T wrote:
> On 07/22/2013 06:21 PM, Christoffer Dall wrote:
> >On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> >>On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>So far, when a guest executes WFE (like when waiting for a spinlock
> >>>to become unlocked), we don't do a thing and let it run uninterrupted.
> >>>
> >>>Another option is to trap a blocking WFE and offer the opportunity
> >>>to the scheduler to switch to another task, potentially giving the
> >>>vcpu holding the spinlock a chance to run sooner.
> >>>
> >>
> >>Idea looks to be correct from my experiments on x86. It does bring some
> >>percentage of benefits in overcommitted guests. Infact,
> >>
> >>https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> >>(this results in using ple handler heuristics in vcpu_block pach).
> >
> >What about the adverse effect in the non-overcommitted case?
> >
> 
> Ideally is should fail to schedule any other task and comeback to halt
> loop. This should not hurt AFAICS. But I agree that, numbers needed to
> support this argument.

So if two VCPUs are scheduled on two PCPUs and the waiting VCPU would
normally wait, say, 1000 cycles to grab the lock, the latency for
grabbing the lock will now be (at least) a couple of thousand cycles
even for a tight switch back into the host and back into the guest (on
currently available hardware).

> 
> For x86, I had seen no side effects with the experiments.
> 

I suspect some workloads on x86 would indeed show some side effects,
but much smaller than on ARM, since x86 has a much more
hardware-optimized VMEXIT cycle time on relatively recent CPUs.

-Christoffer

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-28 20:55           ` Christoffer Dall
  0 siblings, 0 replies; 42+ messages in thread
From: Christoffer Dall @ 2013-07-28 20:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 22, 2013 at 07:27:58PM +0530, Raghavendra K T wrote:
> On 07/22/2013 06:21 PM, Christoffer Dall wrote:
> >On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
> >>On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>So far, when a guest executes WFE (like when waiting for a spinlock
> >>>to become unlocked), we don't do a thing and let it run uninterrupted.
> >>>
> >>>Another option is to trap a blocking WFE and offer the opportunity
> >>>to the scheduler to switch to another task, potentially giving the
> >>>vcpu holding the spinlock a chance to run sooner.
> >>>
> >>
> >>Idea looks to be correct from my experiments on x86. It does bring some
> >>percentage of benefits in overcommitted guests. Infact,
> >>
> >>https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> >>(this results in using ple handler heuristics in vcpu_block pach).
> >
> >What about the adverse effect in the non-overcommitted case?
> >
> 
> Ideally is should fail to schedule any other task and comeback to halt
> loop. This should not hurt AFAICS. But I agree that, numbers needed to
> support this argument.

So if two VCPUs are scheduled on two PCPUs and the waiting VCPU would
normally wait, say, 1000 cycles to grab the lock, the latency for
grabbing the lock will now be (at least) a couple of thousand cycles
even for a tight switch back into the host and back into the guest (on
currently available hardware).

> 
> For x86, I had seen no side effects with the experiments.
> 

I suspect some workloads on x86 would indeed show some side effects,
but much smaller than on ARM, since x86 has a much more
hardware-optimized VMEXIT cycle time on relatively recent CPUs.

-Christoffer

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
  2013-07-28 20:55           ` Christoffer Dall
@ 2013-07-29  7:35             ` Raghavendra K T
  -1 siblings, 0 replies; 42+ messages in thread
From: Raghavendra K T @ 2013-07-29  7:35 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Raghavendra KT, Marc Zyngier, linux-arm-kernel, kvm, kvmarm,
	Catalin Marinas, Will Deacon

On 07/29/2013 02:25 AM, Christoffer Dall wrote:
> On Mon, Jul 22, 2013 at 07:27:58PM +0530, Raghavendra K T wrote:
>> On 07/22/2013 06:21 PM, Christoffer Dall wrote:
>>> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
>>>> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>> So far, when a guest executes WFE (like when waiting for a spinlock
>>>>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>>>>
>>>>> Another option is to trap a blocking WFE and offer the opportunity
>>>>> to the scheduler to switch to another task, potentially giving the
>>>>> vcpu holding the spinlock a chance to run sooner.
>>>>>
>>>>
>>>> Idea looks to be correct from my experiments on x86. It does bring some
>>>> percentage of benefits in overcommitted guests. Infact,
>>>>
>>>> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
>>>> (this results in using ple handler heuristics in vcpu_block pach).
>>>
>>> What about the adverse effect in the non-overcommitted case?
>>>
>>
>> Ideally is should fail to schedule any other task and comeback to halt
>> loop. This should not hurt AFAICS. But I agree that, numbers needed to
>> support this argument.
>
> So if two VCPUs are scheduled on two PCPUs and the waiting VCPU would
> normally wait, say, 1000 cycles to grab the lock, the latency for
> grabbing the lock will now be (at least) a couple of thousand cycles
> even for a tight switch back into the host and back into the guest (on
> currently available hardware).
>

I agree that unnecessary vmexits increase the latency.

>>
>> For x86, I had seen no side effects with the experiments.
>>
>
> I suspect some workloads on x86 would indeed show some side effects, but
> much smaller on ARM, since x86 has a much more hardware-optimized VMEXIT
> cycle time on relatively recent CPUs.
>

I think I should have explained more clearly what was tried on x86;
sorry for the confusion.

On x86, what I tried was, in the halt handler, to do intelligent
directed yields using the already available PLE handler instead of a
simple schedule(). The PLE handler does have some undercommit detection
logic to bail out early, and the halt() was only triggered by the guest
after spinning long enough in pv-spinlocks (which would not happen
otherwise). So there was around a 2-3% overall improvement on x86.
But yes, I am no expert on the ARM ecosystem, though I liked the idea -
and in the end only numbers will tell, as always.. :).
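
(Very roughly, and only as a sketch of how that heuristic could carry
over to the WFE path - kvm_vcpu_on_spin() is the existing generic
directed-yield helper that the x86 PLE exit handler uses, the rest is
illustrative:

static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
	if (kvm_vcpu_get_hsr(vcpu) & 1) {
		/*
		 * WFE: try a directed yield towards another vcpu of
		 * this VM (hopefully the lock holder) rather than a
		 * plain cond_resched().
		 */
		kvm_vcpu_on_spin(vcpu);
	} else {
		/* WFI: block until an interrupt is pending for the VM */
		kvm_vcpu_block(vcpu);
	}
	return 1;
}

Whether that pays off on ARM is exactly the "numbers needed" question.)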



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE
@ 2013-07-29  7:35             ` Raghavendra K T
  0 siblings, 0 replies; 42+ messages in thread
From: Raghavendra K T @ 2013-07-29  7:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/29/2013 02:25 AM, Christoffer Dall wrote:
> On Mon, Jul 22, 2013 at 07:27:58PM +0530, Raghavendra K T wrote:
>> On 07/22/2013 06:21 PM, Christoffer Dall wrote:
>>> On 22 July 2013 10:53, Raghavendra KT <raghavendra.kt.linux@gmail.com> wrote:
>>>> On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>> So far, when a guest executes WFE (like when waiting for a spinlock
>>>>> to become unlocked), we don't do a thing and let it run uninterrupted.
>>>>>
>>>>> Another option is to trap a blocking WFE and offer the opportunity
>>>>> to the scheduler to switch to another task, potentially giving the
>>>>> vcpu holding the spinlock a chance to run sooner.
>>>>>
>>>>
>>>> Idea looks to be correct from my experiments on x86. It does bring some
>>>> percentage of benefits in overcommitted guests. Infact,
>>>>
>>>> https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
>>>> (this results in using ple handler heuristics in vcpu_block pach).
>>>
>>> What about the adverse effect in the non-overcommitted case?
>>>
>>
>> Ideally is should fail to schedule any other task and comeback to halt
>> loop. This should not hurt AFAICS. But I agree that, numbers needed to
>> support this argument.
>
> So if two VCPUs are scheduled on two PCPUs and the waiting VCPU would
> normally wait, say, 1000 cycles to grab the lock, the latency for
> grabbing the lock will now be (at least) a couple of thousand cycles
> even for a tight switch back into the host and back into the guest (on
> currently available hardware).
>

I agree that unnecessary vmexits increase the latency.

>>
>> For x86, I had seen no side effects with the experiments.
>>
>
> I suspect some workloads on x86 would indeed show some side effects, but
> much smaller on ARM, since x86 has a much more hardware-optimized VMEXIT
> cycle time on relatively recent CPUs.
>

I think I should have explained more clearly what was tried on x86;
sorry for the confusion.

On x86, what I tried was, in the halt handler, to do intelligent
directed yields using the already available PLE handler instead of a
simple schedule(). The PLE handler does have some undercommit detection
logic to bail out early, and the halt() was only triggered by the guest
after spinning long enough in pv-spinlocks (which would not happen
otherwise). So there was around a 2-3% overall improvement on x86.
But yes, I am no expert on the ARM ecosystem, though I liked the idea -
and in the end only numbers will tell, as always.. :).

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2013-07-29  7:35 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-19 13:53 [PATCH 0/4] KVM/arm64 fixes for 3.11 Marc Zyngier
2013-07-19 13:53 ` Marc Zyngier
2013-07-19 13:53 ` [PATCH 1/4] arm64: KVM: perform save/restore of PAR_EL1 Marc Zyngier
2013-07-19 13:53   ` Marc Zyngier
2013-07-20 21:51   ` Christoffer Dall
2013-07-20 21:51     ` Christoffer Dall
2013-07-19 13:53 ` [PATCH 2/4] arm64: KVM: add missing dsb before invalidating Stage-2 TLBs Marc Zyngier
2013-07-19 13:53   ` Marc Zyngier
2013-07-19 14:32   ` Will Deacon
2013-07-19 14:32     ` Will Deacon
2013-07-19 14:53     ` Marc Zyngier
2013-07-19 14:53       ` Marc Zyngier
2013-07-19 13:53 ` [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE Marc Zyngier
2013-07-19 13:53   ` Marc Zyngier
2013-07-19 14:25   ` Will Deacon
2013-07-19 14:25     ` Will Deacon
2013-07-19 14:29     ` Marc Zyngier
2013-07-19 14:29       ` Marc Zyngier
2013-07-20 22:04   ` Christoffer Dall
2013-07-20 22:04     ` Christoffer Dall
2013-07-22  7:36   ` Gleb Natapov
2013-07-22  7:36     ` Gleb Natapov
2013-07-22  8:53   ` Raghavendra KT
2013-07-22  8:53     ` Raghavendra KT
2013-07-22 12:51     ` Christoffer Dall
2013-07-22 12:51       ` Christoffer Dall
2013-07-22 13:01       ` Will Deacon
2013-07-22 13:01         ` Will Deacon
2013-07-22 13:57       ` Raghavendra K T
2013-07-22 13:57         ` Raghavendra K T
2013-07-28 20:55         ` Christoffer Dall
2013-07-28 20:55           ` Christoffer Dall
2013-07-29  7:35           ` Raghavendra K T
2013-07-29  7:35             ` Raghavendra K T
2013-07-23 10:41       ` Catalin Marinas
2013-07-23 10:41         ` Catalin Marinas
2013-07-23 16:04         ` Will Deacon
2013-07-23 16:04           ` Will Deacon
2013-07-19 13:53 ` [PATCH 4/4] arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S Marc Zyngier
2013-07-19 13:53   ` Marc Zyngier
2013-07-22  7:36   ` Gleb Natapov
2013-07-22  7:36     ` Gleb Natapov
