linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement
@ 2020-03-26  2:19 Wanpeng Li
  2020-03-26  2:20 ` [PATCH 1/3] KVM: X86: Delay read msr data iff writes ICR MSR Wanpeng Li
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Wanpeng Li @ 2020-03-26  2:19 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

The original single target IPI fastpath patch forgot to filter the
ICR destination shorthand field. Multicast IPIs are not suitable for
this feature, since waking up multiple sleeping vCPUs extends the
interrupt-disabled time; this is especially bad in over-subscribed
scenarios and when the VM has a larger number of vCPUs. Let's narrow
it down to single target IPIs. In addition, this patchset
micro-optimizes the virtual IPI emulation sequence for the fastpath.

Wanpeng Li (3):
  KVM: X86: Delay read msr data iff writes ICR MSR
  KVM: X86: Narrow down the IPI fastpath to single target IPI
  KVM: X86: Micro-optimize IPI fastpath delay

 arch/x86/kvm/lapic.c |  4 ++--
 arch/x86/kvm/lapic.h |  1 +
 arch/x86/kvm/x86.c   | 14 +++++++++++---
 3 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.7.4



* [PATCH 1/3] KVM: X86: Delay read msr data iff writes ICR MSR
  2020-03-26  2:19 [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Wanpeng Li
@ 2020-03-26  2:20 ` Wanpeng Li
  2020-03-26  2:20 ` [PATCH 2/3] KVM: X86: Narrow down the IPI fastpath to single target IPI Wanpeng Li
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2020-03-26  2:20 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

From: Wanpeng Li <wanpengli@tencent.com>

Delay reading the MSR data until we have identified that the guest is
writing the ICR MSR, to avoid penalizing all other MSR writes.
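
For illustration only, the deferred read can be sketched outside the
kernel as below. This is a minimal user-space model, not the KVM code:
struct toy_vcpu, toy_read_edx_eax() and toy_fastpath_wrmsr() are
hypothetical stand-ins, and the data_reads counter exists only to make
the saved work visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_BASE_MSR	0x800u
#define APIC_ICR	0x300u
/* x2APIC ICR MSR index: 0x800 + (0x300 >> 4) = 0x830 */
#define MSR_X2APIC_ICR	(APIC_BASE_MSR + (APIC_ICR >> 4))

/* Hypothetical stand-in for the guest register state. */
struct toy_vcpu {
	uint32_t rax, rcx, rdx;
	unsigned int data_reads;	/* counts EDX:EAX reads (illustration) */
};

/* Models reading the 64-bit WRMSR payload from EDX:EAX. */
static uint64_t toy_read_edx_eax(struct toy_vcpu *v)
{
	v->data_reads++;
	return ((uint64_t)v->rdx << 32) | v->rax;
}

/*
 * Returns 0 if the write is an ICR write (payload stored in *icr_out),
 * 1 if it must take the slow path. The payload is read only in the
 * ICR case, so other MSR writes do not pay for it.
 */
static int toy_fastpath_wrmsr(struct toy_vcpu *v, uint64_t *icr_out)
{
	assert(icr_out != NULL);

	switch (v->rcx) {
	case MSR_X2APIC_ICR:
		*icr_out = toy_read_edx_eax(v);
		return 0;
	default:
		return 1;
	}
}
```

In this model a write to any MSR other than 0x830 returns 1 without
ever touching EDX:EAX.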

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3156e25..9232b15 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1568,11 +1568,12 @@ static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data
 enum exit_fastpath_completion handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu)
 {
 	u32 msr = kvm_rcx_read(vcpu);
-	u64 data = kvm_read_edx_eax(vcpu);
+	u64 data;
 	int ret = 0;
 
 	switch (msr) {
 	case APIC_BASE_MSR + (APIC_ICR >> 4):
+		data = kvm_read_edx_eax(vcpu);
 		ret = handle_fastpath_set_x2apic_icr_irqoff(vcpu, data);
 		break;
 	default:
-- 
2.7.4



* [PATCH 2/3] KVM: X86: Narrow down the IPI fastpath to single target IPI
  2020-03-26  2:19 [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Wanpeng Li
  2020-03-26  2:20 ` [PATCH 1/3] KVM: X86: Delay read msr data iff writes ICR MSR Wanpeng Li
@ 2020-03-26  2:20 ` Wanpeng Li
  2020-03-26  2:20 ` [PATCH 3/3] KVM: X86: Micro-optimize IPI fastpath delay Wanpeng Li
  2020-03-26  9:46 ` [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Paolo Bonzini
  3 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2020-03-26  2:20 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

From: Wanpeng Li <wanpengli@tencent.com>

The original single target IPI fastpath patch forgot to filter the
ICR destination shorthand field. Multicast IPIs are not suitable for
this feature, since waking up multiple sleeping vCPUs extends the
interrupt-disabled time; this is especially bad in over-subscribed
scenarios and when the VM has a larger number of vCPUs. Let's narrow
it down to single target IPIs.

Test setup: two VMs with 76 vCPUs each, one running 'ebizzy -M' and
the other running cyclictest on all vCPUs. With this patch, the
average cyclictest score improves by more than 5%. (pv tlb, pv ipi
and pv sched yield are disabled during testing to avoid disturbance.)
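
As a rough illustration of the filter, the ICR checks can be written
as a standalone predicate. This is a sketch under assumptions: the
helper name is_single_target_fixed_ipi() is hypothetical, and the
constants are restated here from the x86 ICR layout (shorthand in
bits 19:18, destination mode in bit 11, delivery mode in bits 10:8)
rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define APIC_SHORT_MASK		0xc0000u	/* bits 19:18, shorthand */
#define APIC_DEST_NOSHORT	0x00000u
#define APIC_DEST_MASK		0x00800u	/* bit 11, destination mode */
#define APIC_DEST_PHYSICAL	0x00000u
#define APIC_MODE_MASK		0x00700u	/* bits 10:8, delivery mode */
#define APIC_DM_FIXED		0x00000u

static_assert((APIC_SHORT_MASK & (APIC_DEST_MASK | APIC_MODE_MASK)) == 0,
	      "ICR fields must not overlap");

/*
 * True only for an ICR value with no destination shorthand, physical
 * destination mode and fixed delivery mode. Self, all-including-self
 * and all-excluding-self shorthands are rejected.
 */
static bool is_single_target_fixed_ipi(uint64_t data)
{
	return (data & APIC_SHORT_MASK) == APIC_DEST_NOSHORT &&
	       (data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL &&
	       (data & APIC_MODE_MASK) == APIC_DM_FIXED;
}
```

Any ICR value for which the predicate is false would fall back to the
normal WRMSR exit handling.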

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/x86.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9232b15..50ef1c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1554,7 +1554,10 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
  */
 static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
 {
-	if (lapic_in_kernel(vcpu) && apic_x2apic_mode(vcpu->arch.apic) &&
+	if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(vcpu->arch.apic))
+		return 1;
+
+	if (((data & APIC_SHORT_MASK) == APIC_DEST_NOSHORT) &&
 		((data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
 		((data & APIC_MODE_MASK) == APIC_DM_FIXED)) {
 
-- 
2.7.4



* [PATCH 3/3] KVM: X86: Micro-optimize IPI fastpath delay
  2020-03-26  2:19 [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Wanpeng Li
  2020-03-26  2:20 ` [PATCH 1/3] KVM: X86: Delay read msr data iff writes ICR MSR Wanpeng Li
  2020-03-26  2:20 ` [PATCH 2/3] KVM: X86: Narrow down the IPI fastpath to single target IPI Wanpeng Li
@ 2020-03-26  2:20 ` Wanpeng Li
  2020-03-26  9:46 ` [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Paolo Bonzini
  3 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2020-03-26  2:20 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

From: Wanpeng Li <wanpengli@tencent.com>

This patch optimizes the virtual IPI fastpath emulation sequence:

write ICR2                          send virtual IPI
read ICR2                           write ICR2
send virtual IPI         ==>        write ICR
write ICR

We observe a ~0.67% performance improvement in the IPI microbenchmark
(https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/)
on a Skylake server.
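
The reordered sequence can be modelled in user space roughly as
follows. Everything prefixed toy_ is a hypothetical stand-in for the
local APIC state and helpers, and the operation log exists only so
the ordering can be inspected; it is a sketch of the idea, not the
kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_ICR	0x300u
#define APIC_ICR2	0x310u
#define APIC_ICR_BUSY	(1u << 12)	/* delivery status (pending) bit */

enum toy_op { OP_SEND_IPI, OP_SET_ICR2, OP_SET_ICR };

/* Hypothetical model of the local APIC register page. */
struct toy_lapic {
	uint32_t regs[0x400 / 16];	/* one slot per 16-byte register */
	enum toy_op log[8];		/* order of operations performed */
	unsigned int nr_ops;
	uint32_t sent_low, sent_high;	/* last IPI handed to delivery */
};

static void toy_log(struct toy_lapic *a, enum toy_op op)
{
	assert(a->nr_ops < 8);
	a->log[a->nr_ops++] = op;
}

static void toy_set_reg(struct toy_lapic *a, uint32_t reg, uint32_t val)
{
	a->regs[reg >> 4] = val;
	toy_log(a, reg == APIC_ICR2 ? OP_SET_ICR2 : OP_SET_ICR);
}

static void toy_send_ipi(struct toy_lapic *a, uint32_t low, uint32_t high)
{
	a->sent_low = low;
	a->sent_high = high;
	toy_log(a, OP_SEND_IPI);
}

/*
 * New ordering: send the IPI straight from the WRMSR payload, then
 * update ICR2 and ICR. ICR2 no longer has to be read back before
 * the send.
 */
static int toy_fastpath_icr_write(struct toy_lapic *a, uint64_t data)
{
	data &= ~(uint64_t)APIC_ICR_BUSY;	/* no delay, clear pending */
	toy_send_ipi(a, (uint32_t)data, (uint32_t)(data >> 32));
	toy_set_reg(a, APIC_ICR2, (uint32_t)(data >> 32));
	toy_set_reg(a, APIC_ICR, (uint32_t)data);
	return 0;
}
```

The point of the model is that the IPI is dispatched before either
register write and without a read of ICR2.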

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/lapic.c | 4 ++--
 arch/x86/kvm/lapic.h | 1 +
 arch/x86/kvm/x86.c   | 6 +++++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e3099c6..338de38 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1226,7 +1226,7 @@ void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
 }
 EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
 
-static void apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
+void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
 {
 	struct kvm_lapic_irq irq;
 
@@ -1940,7 +1940,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 	case APIC_ICR:
 		/* No delay here, so we always clear the pending bit */
 		val &= ~(1 << 12);
-		apic_send_ipi(apic, val, kvm_lapic_get_reg(apic, APIC_ICR2));
+		kvm_apic_send_ipi(apic, val, kvm_lapic_get_reg(apic, APIC_ICR2));
 		kvm_lapic_set_reg(apic, APIC_ICR, val);
 		break;
 
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index ec6fbfe..bc76860 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -95,6 +95,7 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
+void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high);
 
 u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
 int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 50ef1c5..c4bb7d8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1561,8 +1561,12 @@ static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data
 		((data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
 		((data & APIC_MODE_MASK) == APIC_DM_FIXED)) {
 
+		data &= ~(1 << 12);
+		kvm_apic_send_ipi(vcpu->arch.apic, (u32)data, (u32)(data >> 32));
 		kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR2, (u32)(data >> 32));
-		return kvm_lapic_reg_write(vcpu->arch.apic, APIC_ICR, (u32)data);
+		kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR, (u32)data);
+		trace_kvm_apic_write(APIC_ICR, (u32)data);
+		return 0;
 	}
 
 	return 1;
-- 
2.7.4



* Re: [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement
  2020-03-26  2:19 [PATCH 0/3] KVM: X86: Single target IPI fastpath enhancement Wanpeng Li
                   ` (2 preceding siblings ...)
  2020-03-26  2:20 ` [PATCH 3/3] KVM: X86: Micro-optimize IPI fastpath delay Wanpeng Li
@ 2020-03-26  9:46 ` Paolo Bonzini
  3 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2020-03-26  9:46 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On 26/03/20 03:19, Wanpeng Li wrote:
> The original single target IPI fastpath patch forgot to filter the
> ICR destination shorthand field. Multicast IPIs are not suitable for
> this feature, since waking up multiple sleeping vCPUs extends the
> interrupt-disabled time; this is especially bad in over-subscribed
> scenarios and when the VM has a larger number of vCPUs. Let's narrow
> it down to single target IPIs. In addition, this patchset
> micro-optimizes the virtual IPI emulation sequence for the fastpath.
> 
> Wanpeng Li (3):
>   KVM: X86: Delay read msr data iff writes ICR MSR
>   KVM: X86: Narrow down the IPI fastpath to single target IPI
>   KVM: X86: Micro-optimize IPI fastpath delay
> 
>  arch/x86/kvm/lapic.c |  4 ++--
>  arch/x86/kvm/lapic.h |  1 +
>  arch/x86/kvm/x86.c   | 14 +++++++++++---
>  3 files changed, 14 insertions(+), 5 deletions(-)
> 

Queued 2 for 5.6 and 1-3 for 5.7, thanks.

Paolo


