* [PATCH 1/2] KVM: X86: Single target IPI fastpath @ 2019-11-09 7:05 Wanpeng Li 2019-11-09 7:05 ` [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery Wanpeng Li ` (5 more replies) 0 siblings, 6 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-09 7:05 UTC (permalink / raw) To: linux-kernel, kvm Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel From: Wanpeng Li <wanpengli@tencent.com> This patch optimizes the x2apic physical destination mode, fixed delivery mode, single-target IPI case by delivering the IPI to the receiver immediately after the sender's ICR write vmexit, avoiding various checks when possible. Testing on a Xeon Skylake server: The virtual IPI latency from sender to receiver drops by more than 330 cpu cycles. Running hackbench (reschedule IPI) in the guest, the average handling time of MSR_WRITE-caused vmexits drops by more than 1000 cpu cycles: Before patch: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time MSR_WRITE 5417390 90.01% 16.31% 0.69us 159.60us 1.08us After patch: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time MSR_WRITE 6726109 90.73% 62.18% 0.48us 191.27us 0.58us Signed-off-by: Wanpeng Li <wanpengli@tencent.com> --- arch/x86/kvm/vmx/vmx.c | 39 +++++++++++++++++++++++++++++++++++++-- include/linux/kvm_host.h | 1 + 2 files changed, 38 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 5d21a4a..5c67061 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5924,7 +5924,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu) } } - if (exit_reason < kvm_vmx_max_exit_handlers + if (vcpu->fast_vmexit) + return 1; + else if (exit_reason < kvm_vmx_max_exit_handlers && kvm_vmx_exit_handlers[exit_reason]) return kvm_vmx_exit_handlers[exit_reason](vcpu); else { @@ -6474,6 +6476,34 @@ void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp) bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long 
*regs, bool launched); +static int handle_ipi_fastpath(struct kvm_vcpu *vcpu) +{ + u32 index; + u64 data; + int ret = 0; + + if (lapic_in_kernel(vcpu) && apic_x2apic_mode(vcpu->arch.apic)) { + /* + * fastpath to IPI target + */ + index = kvm_rcx_read(vcpu); + data = kvm_read_edx_eax(vcpu); + + if (((index - APIC_BASE_MSR) << 4 == APIC_ICR) && + ((data & KVM_APIC_DEST_MASK) == APIC_DEST_PHYSICAL) && + ((data & APIC_MODE_MASK) == APIC_DM_FIXED)) { + + kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR2, (u32)(data >> 32)); + ret = kvm_lapic_reg_write(vcpu->arch.apic, APIC_ICR, (u32)data); + + if (ret == 0) + ret = kvm_skip_emulated_instruction(vcpu); + }; + }; + + return ret; +} + static void vmx_vcpu_run(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6615,6 +6645,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) | (1 << VCPU_EXREG_CR3)); vcpu->arch.regs_dirty = 0; + vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); + vcpu->fast_vmexit = false; + if (!is_guest_mode(vcpu) && + vmx->exit_reason == EXIT_REASON_MSR_WRITE) + vcpu->fast_vmexit = handle_ipi_fastpath(vcpu); + pt_guest_exit(vmx); /* @@ -6634,7 +6670,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) vmx->nested.nested_run_pending = 0; vmx->idt_vectoring_info = 0; - vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY) kvm_machine_check(); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 719fc3e..7a7358b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -319,6 +319,7 @@ struct kvm_vcpu { #endif bool preempted; bool ready; + bool fast_vmexit; struct kvm_vcpu_arch arch; struct dentry *debugfs_dentry; }; -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li @ 2019-11-09 7:05 ` Wanpeng Li 2019-11-11 21:59 ` Paolo Bonzini 2019-11-10 1:59 ` kbuild test robot ` (4 subsequent siblings) 5 siblings, 1 reply; 19+ messages in thread From: Wanpeng Li @ 2019-11-09 7:05 UTC (permalink / raw) To: linux-kernel, kvm Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel From: Wanpeng Li <wanpengli@tencent.com> After disabling mwait/halt/pause vmexits, IPIs such as RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR are among the main remaining causes of vmexits observed in production environments and cannot be optimized by PV IPIs. This patch is a follow-up to commit 0e6d242eccdb ("KVM: LAPIC: Micro optimize IPI latency"); it removes redundant logic executed before a fixed mode IPI is delivered on the fast path. - broadcast handling must take the slow path, so the delivery mode repair can be delayed until just before the slow path. - self-IPIs are no longer intercepted by the hypervisor once APICv is in use, and machines supporting APICv are common now. In addition, kvm_apic_map_get_dest_lapic() can handle self-IPIs, so there is no need for a shortcut in the non-APICv case. 
Signed-off-by: Wanpeng Li <wanpengli@tencent.com> --- arch/x86/kvm/irq_comm.c | 6 +++--- arch/x86/kvm/lapic.c | 5 ----- 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c index 8ecd48d..aa88156 100644 --- a/arch/x86/kvm/irq_comm.c +++ b/arch/x86/kvm/irq_comm.c @@ -52,15 +52,15 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]; unsigned int dest_vcpus = 0; + if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map)) + return r; + if (irq->dest_mode == 0 && irq->dest_id == 0xff && kvm_lowest_prio_delivery(irq)) { printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n"); irq->delivery_mode = APIC_DM_FIXED; } - if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map)) - return r; - memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap)); kvm_for_each_vcpu(i, vcpu, kvm) { diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index b29d00b..ea936fa 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -951,11 +951,6 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src, *r = -1; - if (irq->shorthand == APIC_DEST_SELF) { - *r = kvm_apic_set_irq(src->vcpu, irq, dest_map); - return true; - } - rcu_read_lock(); map = rcu_dereference(kvm->arch.apic_map); -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery 2019-11-09 7:05 ` [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery Wanpeng Li @ 2019-11-11 21:59 ` Paolo Bonzini 2019-11-12 1:34 ` Wanpeng Li 0 siblings, 1 reply; 19+ messages in thread From: Paolo Bonzini @ 2019-11-11 21:59 UTC (permalink / raw) To: Wanpeng Li, linux-kernel, kvm Cc: Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On 09/11/19 08:05, Wanpeng Li wrote: > From: Wanpeng Li <wanpengli@tencent.com> > > After disabling mwait/halt/pause vmexits, RESCHEDULE_VECTOR and > CALL_FUNCTION_SINGLE_VECTOR etc IPI is one of the main remaining > cause of vmexits observed in product environment which can't be > optimized by PV IPIs. This patch is the follow-up on commit > 0e6d242eccdb (KVM: LAPIC: Micro optimize IPI latency), to optimize > redundancy logic before fixed mode ipi is delivered in the fast > path. > > - broadcast handling needs to go slow path, so the delivery mode repair > can be delayed to before slow path. I agree with this part, but is the cost of the irq->shorthand check really measurable? Paolo > - self-IPI will not be intervened by hypervisor any more after APICv is > introduced and the boxes support APICv are popular now. In addition, > kvm_apic_map_get_dest_lapic() can handle the self-IPI, so there is no > need a shortcut for the non-APICv case. 
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com> > --- > arch/x86/kvm/irq_comm.c | 6 +++--- > arch/x86/kvm/lapic.c | 5 ----- > 2 files changed, 3 insertions(+), 8 deletions(-) > > diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c > index 8ecd48d..aa88156 100644 > --- a/arch/x86/kvm/irq_comm.c > +++ b/arch/x86/kvm/irq_comm.c > @@ -52,15 +52,15 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, > unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]; > unsigned int dest_vcpus = 0; > > + if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map)) > + return r; > + > if (irq->dest_mode == 0 && irq->dest_id == 0xff && > kvm_lowest_prio_delivery(irq)) { > printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n"); > irq->delivery_mode = APIC_DM_FIXED; > } > > - if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map)) > - return r; > - > memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap)); > > kvm_for_each_vcpu(i, vcpu, kvm) { > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index b29d00b..ea936fa 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -951,11 +951,6 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src, > > *r = -1; > > - if (irq->shorthand == APIC_DEST_SELF) { > - *r = kvm_apic_set_irq(src->vcpu, irq, dest_map); > - return true; > - } > - > rcu_read_lock(); > map = rcu_dereference(kvm->arch.apic_map); > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery 2019-11-11 21:59 ` Paolo Bonzini @ 2019-11-12 1:34 ` Wanpeng Li 0 siblings, 0 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-12 1:34 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 09/11/19 08:05, Wanpeng Li wrote: > > From: Wanpeng Li <wanpengli@tencent.com> > > > > After disabling mwait/halt/pause vmexits, RESCHEDULE_VECTOR and > > CALL_FUNCTION_SINGLE_VECTOR etc IPI is one of the main remaining > > cause of vmexits observed in product environment which can't be > > optimized by PV IPIs. This patch is the follow-up on commit > > 0e6d242eccdb (KVM: LAPIC: Micro optimize IPI latency), to optimize > > redundancy logic before fixed mode ipi is delivered in the fast > > path. > > > > - broadcast handling needs to go slow path, so the delivery mode repair > > can be delayed to before slow path. > > I agree with this part, but is the cost of the irq->shorthand check > really measurable? I can drop the second part for v2. Wanpeng ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li @ 2019-11-10 1:59 ` kbuild test robot 2019-11-10 1:59 ` kbuild test robot ` (4 subsequent siblings) 5 siblings, 0 replies; 19+ messages in thread From: kbuild test robot @ 2019-11-10 1:59 UTC (permalink / raw) To: Wanpeng Li Cc: kbuild-all, linux-kernel, kvm, Paolo Bonzini, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel Hi Wanpeng, I love your patch! Perhaps something to improve: [auto build test WARNING on kvm/linux-next] [also build test WARNING on v5.4-rc6 next-20191108] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Wanpeng-Li/KVM-X86-Single-target-IPI-fastpath/20191110-081303 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> coccinelle warnings: (new ones prefixed by >>) >> arch/x86/kvm/vmx/vmx.c:6509:3-4: Unneeded semicolon arch/x86/kvm/vmx/vmx.c:6510:2-3: Unneeded semicolon Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] KVM: X86: fix semicolon.cocci warnings 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li @ 2019-11-10 1:59 ` kbuild test robot 2019-11-10 1:59 ` kbuild test robot ` (4 subsequent siblings) 5 siblings, 0 replies; 19+ messages in thread From: kbuild test robot @ 2019-11-10 1:59 UTC (permalink / raw) To: Wanpeng Li Cc: kbuild-all, linux-kernel, kvm, Paolo Bonzini, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel From: kbuild test robot <lkp@intel.com> arch/x86/kvm/vmx/vmx.c:6509:3-4: Unneeded semicolon arch/x86/kvm/vmx/vmx.c:6510:2-3: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci Fixes: fbb78ac80ca1 ("KVM: X86: Single target IPI fastpath") CC: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: kbuild test robot <lkp@intel.com> --- url: https://github.com/0day-ci/linux/commits/Wanpeng-Li/KVM-X86-Single-target-IPI-fastpath/20191110-081303 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next vmx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6506,8 +6506,8 @@ static int handle_ipi_fastpath(struct kv if (ret == 0) ret = kvm_skip_emulated_instruction(vcpu); - }; - }; + } + } return ret; } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li ` (2 preceding siblings ...) 2019-11-10 1:59 ` kbuild test robot @ 2019-11-11 13:06 ` Vitaly Kuznetsov 2019-11-12 1:18 ` Wanpeng Li 2019-11-11 21:59 ` Paolo Bonzini 2019-11-14 11:58 ` Paolo Bonzini 5 siblings, 1 reply; 19+ messages in thread From: Vitaly Kuznetsov @ 2019-11-11 13:06 UTC (permalink / raw) To: Wanpeng Li Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, linux-kernel, kvm Wanpeng Li <kernellwp@gmail.com> writes: > + > static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > { > struct vcpu_vmx *vmx = to_vmx(vcpu); > @@ -6615,6 +6645,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > | (1 << VCPU_EXREG_CR3)); > vcpu->arch.regs_dirty = 0; > > + vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); > + vcpu->fast_vmexit = false; > + if (!is_guest_mode(vcpu) && > + vmx->exit_reason == EXIT_REASON_MSR_WRITE) > + vcpu->fast_vmexit = handle_ipi_fastpath(vcpu); I have to admit this looks too much to me :-( Yes, I see the benefits of taking a shortcut (by actualy penalizing all other MSR writes) but the question I have is: where do we stop? Also, this 'shortcut' creates an imbalance in tracing: you don't go down to kvm_emulate_wrmsr() so handle_ipi_fastpath() should probably gain a tracepoint. Looking at 'fast_vmexit' name makes me think this is something generic. Is this so? Maybe we can create some sort of an infrastructure for fast vmexit handling and make it easy to hook things up to it? (Maybe it's just me who thinks the codebase is already too complex, let's see what Paolo and other reviewers have to say). -- Vitaly ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-11 13:06 ` [PATCH 1/2] KVM: X86: Single target IPI fastpath Vitaly Kuznetsov @ 2019-11-12 1:18 ` Wanpeng Li 0 siblings, 0 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-12 1:18 UTC (permalink / raw) To: Vitaly Kuznetsov Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, LKML, kvm On Mon, 11 Nov 2019 at 21:06, Vitaly Kuznetsov <vkuznets@redhat.com> wrote: > > Wanpeng Li <kernellwp@gmail.com> writes: > > > + > > static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > > { > > struct vcpu_vmx *vmx = to_vmx(vcpu); > > @@ -6615,6 +6645,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > > | (1 << VCPU_EXREG_CR3)); > > vcpu->arch.regs_dirty = 0; > > > > + vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); > > + vcpu->fast_vmexit = false; > > + if (!is_guest_mode(vcpu) && > > + vmx->exit_reason == EXIT_REASON_MSR_WRITE) > > + vcpu->fast_vmexit = handle_ipi_fastpath(vcpu); > > I have to admit this looks too much to me :-( Yes, I see the benefits of > taking a shortcut (by actualy penalizing all other MSR writes) but the > question I have is: where do we stop? In our iaas environment observation, ICR and TSCDEADLINE are the main MSR write vmexits. Before patch: tscdeadline_immed 3900 tscdeadline 5413 After patch: tscdeadline_immed 3912 tscdeadline 5427 So the penalize can be tolerated. > > Also, this 'shortcut' creates an imbalance in tracing: you don't go down > to kvm_emulate_wrmsr() so handle_ipi_fastpath() should probably gain a > tracepoint. Agreed. > > Looking at 'fast_vmexit' name makes me think this is something > generic. Is this so? Maybe we can create some sort of an infrastructure > for fast vmexit handling and make it easy to hook things up to it? Maybe an indirect jump? But I can have a try. Wanpeng ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li ` (3 preceding siblings ...) 2019-11-11 13:06 ` [PATCH 1/2] KVM: X86: Single target IPI fastpath Vitaly Kuznetsov @ 2019-11-11 21:59 ` Paolo Bonzini 2019-11-12 1:33 ` Wanpeng Li 2019-11-14 11:58 ` Paolo Bonzini 5 siblings, 1 reply; 19+ messages in thread From: Paolo Bonzini @ 2019-11-11 21:59 UTC (permalink / raw) To: Wanpeng Li, linux-kernel, kvm Cc: Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On 09/11/19 08:05, Wanpeng Li wrote: > From: Wanpeng Li <wanpengli@tencent.com> > > This patch tries to optimize x2apic physical destination mode, fixed delivery > mode single target IPI by delivering IPI to receiver immediately after sender > writes ICR vmexit to avoid various checks when possible. > > Testing on Xeon Skylake server: > > The virtual IPI latency from sender send to receiver receive reduces more than > 330+ cpu cycles. > > Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE > caused vmexit reduces more than 1000+ cpu cycles: > > Before patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 5417390 90.01% 16.31% 0.69us 159.60us 1.08us > > After patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 6726109 90.73% 62.18% 0.48us 191.27us 0.58us Do you have retpolines enabled? The bulk of the speedup might come just from the indirect jump. Paolo ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-11 21:59 ` Paolo Bonzini @ 2019-11-12 1:33 ` Wanpeng Li 2019-11-13 6:05 ` Wanpeng Li 2019-11-14 3:12 ` Wanpeng Li 0 siblings, 2 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-12 1:33 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 09/11/19 08:05, Wanpeng Li wrote: > > From: Wanpeng Li <wanpengli@tencent.com> > > > > This patch tries to optimize x2apic physical destination mode, fixed delivery > > mode single target IPI by delivering IPI to receiver immediately after sender > > writes ICR vmexit to avoid various checks when possible. > > > > Testing on Xeon Skylake server: > > > > The virtual IPI latency from sender send to receiver receive reduces more than > > 330+ cpu cycles. > > > > Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE > > caused vmexit reduces more than 1000+ cpu cycles: > > > > Before patch: > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > MSR_WRITE 5417390 90.01% 16.31% 0.69us 159.60us 1.08us > > > > After patch: > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > MSR_WRITE 6726109 90.73% 62.18% 0.48us 191.27us 0.58us > > Do you have retpolines enabled? The bulk of the speedup might come just > from the indirect jump. 
Adding 'mitigations=off' to the host grub parameter: Before patch: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time MSR_WRITE 2681713 92.98% 77.52% 0.38us 18.54us 0.73us ( +- 0.02% ) After patch: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time MSR_WRITE 2953447 92.48% 62.47% 0.30us 59.09us 0.40us ( +- 0.02% ) Actually, this is not the first attempt to add shortcut for MSR writes which performance sensitive, the other effort is tscdeadline timer from Isaku Yamahata, https://patchwork.kernel.org/cover/10541035/ , ICR and TSCDEADLINE MSR writes cause the main MSR write vmexits in our product observation, multicast IPIs are not as common as unicast IPI like RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc. As far as I know, something similar to this patch has already been deployed in some cloud companies private kvm fork. Wanpeng ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-12 1:33 ` Wanpeng Li @ 2019-11-13 6:05 ` Wanpeng Li 2019-11-14 3:12 ` Wanpeng Li 1 sibling, 0 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-13 6:05 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On Tue, 12 Nov 2019 at 09:33, Wanpeng Li <kernellwp@gmail.com> wrote: > > On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 09/11/19 08:05, Wanpeng Li wrote: > > > From: Wanpeng Li <wanpengli@tencent.com> > > > > > > This patch tries to optimize x2apic physical destination mode, fixed delivery > > > mode single target IPI by delivering IPI to receiver immediately after sender > > > writes ICR vmexit to avoid various checks when possible. > > > > > > Testing on Xeon Skylake server: > > > > > > The virtual IPI latency from sender send to receiver receive reduces more than > > > 330+ cpu cycles. > > > > > > Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE > > > caused vmexit reduces more than 1000+ cpu cycles: > > > > > > Before patch: > > > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > > MSR_WRITE 5417390 90.01% 16.31% 0.69us 159.60us 1.08us > > > > > > After patch: > > > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > > MSR_WRITE 6726109 90.73% 62.18% 0.48us 191.27us 0.58us > > > > Do you have retpolines enabled? The bulk of the speedup might come just > > from the indirect jump. 
> > Adding 'mitigations=off' to the host grub parameter: > > Before patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 2681713 92.98% 77.52% 0.38us 18.54us > 0.73us ( +- 0.02% ) > > After patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 2953447 92.48% 62.47% 0.30us 59.09us > 0.40us ( +- 0.02% ) > > Actually, this is not the first attempt to add shortcut for MSR writes > which performance sensitive, the other effort is tscdeadline timer > from Isaku Yamahata, https://patchwork.kernel.org/cover/10541035/ , > ICR and TSCDEADLINE MSR writes cause the main MSR write vmexits in our > product observation, multicast IPIs are not as common as unicast IPI > like RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc. As far as > I know, something similar to this patch has already been deployed in > some cloud companies private kvm fork. Hi Paolo, Do you think I should continue for this? Wanpeng ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-12 1:33 ` Wanpeng Li 2019-11-13 6:05 ` Wanpeng Li @ 2019-11-14 3:12 ` Wanpeng Li 1 sibling, 0 replies; 19+ messages in thread From: Wanpeng Li @ 2019-11-14 3:12 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel On Tue, 12 Nov 2019 at 09:33, Wanpeng Li <kernellwp@gmail.com> wrote: > > On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 09/11/19 08:05, Wanpeng Li wrote: > > > From: Wanpeng Li <wanpengli@tencent.com> > > > > > > This patch tries to optimize x2apic physical destination mode, fixed delivery > > > mode single target IPI by delivering IPI to receiver immediately after sender > > > writes ICR vmexit to avoid various checks when possible. > > > > > > Testing on Xeon Skylake server: > > > > > > The virtual IPI latency from sender send to receiver receive reduces more than > > > 330+ cpu cycles. > > > > > > Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE > > > caused vmexit reduces more than 1000+ cpu cycles: > > > > > > Before patch: > > > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > > MSR_WRITE 5417390 90.01% 16.31% 0.69us 159.60us 1.08us > > > > > > After patch: > > > > > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > > > MSR_WRITE 6726109 90.73% 62.18% 0.48us 191.27us 0.58us > > > > Do you have retpolines enabled? The bulk of the speedup might come just > > from the indirect jump. 
> > Adding 'mitigations=off' to the host grub parameter: > > Before patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 2681713 92.98% 77.52% 0.38us 18.54us > 0.73us ( +- 0.02% ) > > After patch: > > VM-EXIT Samples Samples% Time% Min Time Max Time Avg time > MSR_WRITE 2953447 92.48% 62.47% 0.30us 59.09us > 0.40us ( +- 0.02% ) Hmm, sender side less vmexit time is due to kvm_exit tracepoint is still left in vmx_handle_exit, and ICR wrmsr is moved ahead, that is why the time between kvm_exit tracepoint and kvm_entry tracepoint is reduced. But the virtual IPI latency still can reduce 330+ cycles. Wanpeng ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath 2019-11-09 7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li ` (4 preceding siblings ...) 2019-11-11 21:59 ` Paolo Bonzini @ 2019-11-14 11:58 ` Paolo Bonzini 2019-11-14 15:22 ` Sean Christopherson 5 siblings, 1 reply; 19+ messages in thread From: Paolo Bonzini @ 2019-11-14 11:58 UTC (permalink / raw) To: Wanpeng Li, linux-kernel, kvm Cc: Radim Krčmář, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel Ok, it's not _so_ ugly after all. > --- > arch/x86/kvm/vmx/vmx.c | 39 +++++++++++++++++++++++++++++++++++++-- > include/linux/kvm_host.h | 1 + > 2 files changed, 38 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 5d21a4a..5c67061 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -5924,7 +5924,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu) > } > } > > - if (exit_reason < kvm_vmx_max_exit_handlers > + if (vcpu->fast_vmexit) > + return 1; > + else if (exit_reason < kvm_vmx_max_exit_handlers Instead of a separate vcpu->fast_vmexit, perhaps you can set exit_reason to vmx->exit_reason to -1 if the fast path succeeds. > + if (ret == 0) > + ret = kvm_skip_emulated_instruction(vcpu); Please move the "kvm_skip_emulated_instruction(vcpu)" to vmx_handle_exit, so that this basically is #define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1 if (ret == 0) vcpu->exit_reason = EXIT_REASON_NEED_SKIP_EMULATED_INSN; and handle_ipi_fastpath can return void. Thanks, Paolo > + }; > + }; > + > + return ret; > +} > + > static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > { > struct vcpu_vmx *vmx = to_vmx(vcpu); > @@ -6615,6 +6645,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > | (1 << VCPU_EXREG_CR3)); > vcpu->arch.regs_dirty = 0; > > + vmx->exit_reason = vmx->fail ? 
0xdead : vmcs_read32(VM_EXIT_REASON); > + vcpu->fast_vmexit = false; > + if (!is_guest_mode(vcpu) && > + vmx->exit_reason == EXIT_REASON_MSR_WRITE) > + vcpu->fast_vmexit = handle_ipi_fastpath(vcpu); This should be done later, at least after kvm_put_guest_xcr0, because running with partially-loaded guest state is harder to audit. The best place to put it actually is right after the existing vmx->exit_reason assignment, where we already handle EXIT_REASON_MCE_DURING_VMENTRY. > pt_guest_exit(vmx); > > /* > @@ -6634,7 +6670,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) > vmx->nested.nested_run_pending = 0; > vmx->idt_vectoring_info = 0; > > - vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); > if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY) > kvm_machine_check(); > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 719fc3e..7a7358b 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -319,6 +319,7 @@ struct kvm_vcpu { > #endif > bool preempted; > bool ready; > + bool fast_vmexit; > struct kvm_vcpu_arch arch; > struct dentry *debugfs_dentry; > }; > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath
  2019-11-14 11:58 ` Paolo Bonzini
@ 2019-11-14 15:22 ` Sean Christopherson
  2019-11-14 15:44   ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread

From: Sean Christopherson @ 2019-11-14 15:22 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Wanpeng Li, linux-kernel, kvm, Radim Krčmář, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel

On Thu, Nov 14, 2019 at 12:58:56PM +0100, Paolo Bonzini wrote:
> Ok, it's not _so_ ugly after all.
> 
> > ---
> >  arch/x86/kvm/vmx/vmx.c   | 39 +++++++++++++++++++++++++++++++++++++--
> >  include/linux/kvm_host.h |  1 +
> >  2 files changed, 38 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 5d21a4a..5c67061 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -5924,7 +5924,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> > 		}
> > 	}
> > 
> > -	if (exit_reason < kvm_vmx_max_exit_handlers
> > +	if (vcpu->fast_vmexit)
> > +		return 1;
> > +	else if (exit_reason < kvm_vmx_max_exit_handlers
> 
> Instead of a separate vcpu->fast_vmexit, perhaps you can set
> vmx->exit_reason to -1 if the fast path succeeds.

Actually, rather than make this super special case, what about moving the
handling into vmx_handle_exit_irqoff()?  Practically speaking that would
only add ~50 cycles (two VMREADs) relative to the code being run right
after kvm_put_guest_xcr0().  It has the advantage of restoring the host's
hardware breakpoints, preserving a semi-accurate last_guest_tsc, and
running with vcpu->mode set back to OUTSIDE_GUEST_MODE.  Hopefully it'd
also be more intuitive for people unfamiliar with the code.

> > > +		if (ret == 0)
> > > +			ret = kvm_skip_emulated_instruction(vcpu);
> 
> Please move the "kvm_skip_emulated_instruction(vcpu)" to
> vmx_handle_exit, so that this basically is
> 
> #define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1
> 
> 	if (ret == 0)
> 		vcpu->exit_reason = EXIT_REASON_NEED_SKIP_EMULATED_INSN;
> 
> and handle_ipi_fastpath can return void.

I'd rather we add a dedicated variable to say the exit has already been
handled.  Overloading exit_reason is bound to cause confusion, and that's
probably a best case scenario.

> > +	};
> > +	};
> > +
> > +	return ret;
> > +}
> > +
> >  static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >  {
> > 	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > @@ -6615,6 +6645,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > 				  | (1 << VCPU_EXREG_CR3));
> > 	vcpu->arch.regs_dirty = 0;
> > 
> > +	vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON);
> > +	vcpu->fast_vmexit = false;
> > +	if (!is_guest_mode(vcpu) &&
> > +	    vmx->exit_reason == EXIT_REASON_MSR_WRITE)
> > +		vcpu->fast_vmexit = handle_ipi_fastpath(vcpu);
> 
> This should be done later, at least after kvm_put_guest_xcr0, because
> running with partially-loaded guest state is harder to audit.  The best
> place to put it actually is right after the existing vmx->exit_reason
> assignment, where we already handle EXIT_REASON_MCE_DURING_VMENTRY.
> 
> > 	pt_guest_exit(vmx);
> > 
> > 	/*
> > @@ -6634,7 +6670,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > 	vmx->nested.nested_run_pending = 0;
> > 	vmx->idt_vectoring_info = 0;
> > 
> > -	vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON);
> > 	if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
> > 		kvm_machine_check();
> > 
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 719fc3e..7a7358b 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -319,6 +319,7 @@ struct kvm_vcpu {
> > #endif
> > 	bool preempted;
> > 	bool ready;
> > +	bool fast_vmexit;
> > 	struct kvm_vcpu_arch arch;
> > 	struct dentry *debugfs_dentry;
> > };
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath
  2019-11-14 15:22 ` Sean Christopherson
@ 2019-11-14 15:44 ` Paolo Bonzini
  2019-11-14 16:34   ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread

From: Paolo Bonzini @ 2019-11-14 15:44 UTC (permalink / raw)
To: Sean Christopherson
Cc: Wanpeng Li, linux-kernel, kvm, Radim Krčmář, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel

On 14/11/19 16:22, Sean Christopherson wrote:
>> Instead of a separate vcpu->fast_vmexit, perhaps you can set
>> vmx->exit_reason to -1 if the fast path succeeds.
> 
> Actually, rather than make this super special case, what about moving the
> handling into vmx_handle_exit_irqoff()?  Practically speaking that would
> only add ~50 cycles (two VMREADs) relative to the code being run right
> after kvm_put_guest_xcr0().  It has the advantage of restoring the host's
> hardware breakpoints, preserving a semi-accurate last_guest_tsc, and
> running with vcpu->mode set back to OUTSIDE_GUEST_MODE.  Hopefully it'd
> also be more intuitive for people unfamiliar with the code.

Yes, that's a good idea.  The expensive bit between handle_exit_irqoff
and handle_exit is srcu_read_lock, which has two memory barriers in it.

>>> +		if (ret == 0)
>>> +			ret = kvm_skip_emulated_instruction(vcpu);
>> Please move the "kvm_skip_emulated_instruction(vcpu)" to
>> vmx_handle_exit, so that this basically is
>>
>> #define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1
>>
>> 	if (ret == 0)
>> 		vcpu->exit_reason = EXIT_REASON_NEED_SKIP_EMULATED_INSN;
>>
>> and handle_ipi_fastpath can return void.
> 
> I'd rather we add a dedicated variable to say the exit has already been
> handled.  Overloading exit_reason is bound to cause confusion, and that's
> probably a best case scenario.

I proposed the fake exit reason to avoid a ternary return code from
handle_ipi_fastpath (return to guest, return to userspace, call
kvm_x86_ops->handle_exit), which Wanpeng's patch was mishandling.

To ensure confusion does not become the best case scenario, perhaps it
is worth trying to push exit_reason into vcpu_enter_guest's stack.
vcpu_enter_guest can pass a pointer to it, and then it can be passed
back into kvm_x86_ops->handle_exit{,_irqoff}.  It could be a struct too,
instead of just a bare u32.

This would ensure at compile-time that exit_reason is not accessed
outside the short path from vmexit to kvm_x86_ops->handle_exit.

Paolo
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath
  2019-11-14 15:44 ` Paolo Bonzini
@ 2019-11-14 16:34 ` Sean Christopherson
  2019-11-15  5:56   ` Wanpeng Li
  0 siblings, 1 reply; 19+ messages in thread

From: Sean Christopherson @ 2019-11-14 16:34 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Wanpeng Li, linux-kernel, kvm, Radim Krčmář, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel

On Thu, Nov 14, 2019 at 04:44:33PM +0100, Paolo Bonzini wrote:
> On 14/11/19 16:22, Sean Christopherson wrote:
> >> Instead of a separate vcpu->fast_vmexit, perhaps you can set
> >> vmx->exit_reason to -1 if the fast path succeeds.
> > 
> > Actually, rather than make this super special case, what about moving the
> > handling into vmx_handle_exit_irqoff()?  Practically speaking that would
> > only add ~50 cycles (two VMREADs) relative to the code being run right
> > after kvm_put_guest_xcr0().  It has the advantage of restoring the host's
> > hardware breakpoints, preserving a semi-accurate last_guest_tsc, and
> > running with vcpu->mode set back to OUTSIDE_GUEST_MODE.  Hopefully it'd
> > also be more intuitive for people unfamiliar with the code.
> 
> Yes, that's a good idea.  The expensive bit between handle_exit_irqoff
> and handle_exit is srcu_read_lock, which has two memory barriers in it.

Preaching to the choir at this point, but it'd also eliminate latency
spikes due to interrupts.

> >>> +		if (ret == 0)
> >>> +			ret = kvm_skip_emulated_instruction(vcpu);
> >> Please move the "kvm_skip_emulated_instruction(vcpu)" to
> >> vmx_handle_exit, so that this basically is
> >>
> >> #define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1
> >>
> >> 	if (ret == 0)
> >> 		vcpu->exit_reason = EXIT_REASON_NEED_SKIP_EMULATED_INSN;
> >>
> >> and handle_ipi_fastpath can return void.
> > 
> > I'd rather we add a dedicated variable to say the exit has already been
> > handled.  Overloading exit_reason is bound to cause confusion, and that's
> > probably a best case scenario.
> 
> I proposed the fake exit reason to avoid a ternary return code from
> handle_ipi_fastpath (return to guest, return to userspace, call
> kvm_x86_ops->handle_exit), which Wanpeng's patch was mishandling.

For this case, I think we can get away with a WARN if kvm_lapic_reg_write()
fails since it (currently) can't fail for ICR.  That would allow us to keep
a void return for ->handle_exit_irqoff() and avoid an overloaded return
value.

And, if the fastpath is used for all ICR writes, not just FIXED+PHYSICAL,
and is implemented for SVM, then we don't even need a flag, e.g.
kvm_x2apic_msr_write() can simply ignore ICR writes, similar to how
handle_exception() ignores #MC and NMIs.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 87b0fcc23ef8..d7b79f7faac1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2615,12 +2615,11 @@ int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 	if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(apic))
 		return 1;
 
-	if (reg == APIC_ICR2)
+
+	/* ICR2 writes are ignored and ICR writes are handled early. */
+	if (reg == APIC_ICR2 || reg == APIC_ICR)
 		return 1;
 
-	/* if this is ICR write vector before command */
-	if (reg == APIC_ICR)
-		kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(data >> 32));
 	return kvm_lapic_reg_write(apic, reg, (u32)data);
 }

Another bonus to this approach is that the refactoring for the exit_reason
can be done in a separate series.

> To ensure confusion does not become the best case scenario, perhaps it
> is worth trying to push exit_reason into vcpu_enter_guest's stack.
> vcpu_enter_guest can pass a pointer to it, and then it can be passed
> back into kvm_x86_ops->handle_exit{,_irqoff}.  It could be a struct too,
> instead of just a bare u32.

On the other hand, if it's a bare u32 then kvm_x86_ops->run can simply
return the value instead of doing out parameter shenanigans.

> This would ensure at compile-time that exit_reason is not accessed
> outside the short path from vmexit to kvm_x86_ops->handle_exit.

That would be nice.  Surprisingly, the code actually appears to be fairly
clean, I thought for sure the nested stuff would be using exit_reason all
over the place.  The only one that needs to be fixed is handle_vmfunc(),
which passes vmx->exit_reason when forwarding the VM-Exit instead of
simply hardcoding EXIT_REASON_VMFUNC.
* Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath
  2019-11-14 16:34 ` Sean Christopherson
@ 2019-11-15  5:56 ` Wanpeng Li
  0 siblings, 0 replies; 19+ messages in thread

From: Wanpeng Li @ 2019-11-15  5:56 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: LKML, kvm, Radim Krčmář, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel

On Fri, 15 Nov 2019 at 00:34, Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Nov 14, 2019 at 04:44:33PM +0100, Paolo Bonzini wrote:
> > On 14/11/19 16:22, Sean Christopherson wrote:
> > >> Instead of a separate vcpu->fast_vmexit, perhaps you can set
> > >> vmx->exit_reason to -1 if the fast path succeeds.
> > >
> > > Actually, rather than make this super special case, what about moving the
> > > handling into vmx_handle_exit_irqoff()?  Practically speaking that would
> > > only add ~50 cycles (two VMREADs) relative to the code being run right
> > > after kvm_put_guest_xcr0().  It has the advantage of restoring the host's
> > > hardware breakpoints, preserving a semi-accurate last_guest_tsc, and
> > > running with vcpu->mode set back to OUTSIDE_GUEST_MODE.  Hopefully it'd
> > > also be more intuitive for people unfamiliar with the code.
> >
> > Yes, that's a good idea.  The expensive bit between handle_exit_irqoff
> > and handle_exit is srcu_read_lock, which has two memory barriers in it.

Moving the handling into vmx_handle_exit_irqoff() is ~100 cycles worse
than handling it right after kvm_put_guest_xcr0() in my testing.  So,
which one do you prefer?

For moving the handling into vmx_handle_exit_irqoff(), how about the
patch below:

---8<---
From 1b20b0a80c6da18b721f125cc40f8be5ad31f4b1 Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli@tencent.com>
Date: Fri, 15 Nov 2019 10:18:09 +0800
Subject: [PATCH v2] KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpath

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/include/asm/kvm_host.h |  4 ++--
 arch/x86/include/uapi/asm/vmx.h |  1 +
 arch/x86/kvm/svm.c              |  4 ++--
 arch/x86/kvm/vmx/vmx.c          | 40 +++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/x86.c              |  5 +++--
 5 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 24d6598..3604f3a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1067,7 +1067,7 @@ struct kvm_x86_ops {
 	void (*tlb_flush_gva)(struct kvm_vcpu *vcpu, gva_t addr);
 
 	void (*run)(struct kvm_vcpu *vcpu);
-	int (*handle_exit)(struct kvm_vcpu *vcpu);
+	int (*handle_exit)(struct kvm_vcpu *vcpu, u32 *vcpu_exit_reason);
 	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
@@ -1117,7 +1117,7 @@ struct kvm_x86_ops {
 	int (*check_intercept)(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
-	void (*handle_exit_irqoff)(struct kvm_vcpu *vcpu);
+	void (*handle_exit_irqoff)(struct kvm_vcpu *vcpu, u32 *vcpu_exit_reason);
 	bool (*mpx_supported)(void);
 	bool (*xsaves_supported)(void);
 	bool (*umip_emulated)(void);
diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index 3eb8411..b33c6e1 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -88,6 +88,7 @@
 #define EXIT_REASON_XRSTORS             64
 #define EXIT_REASON_UMWAIT              67
 #define EXIT_REASON_TPAUSE              68
+#define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1
 
 #define VMX_EXIT_REASONS \
 	{ EXIT_REASON_EXCEPTION_NMI,         "EXCEPTION_NMI" }, \
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c5673bd..3bb0661 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4927,7 +4927,7 @@ static void svm_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2)
 	*info2 = control->exit_info_2;
 }
 
-static int handle_exit(struct kvm_vcpu *vcpu)
+static int handle_exit(struct kvm_vcpu *vcpu, u32 *vcpu_exit_reason)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct kvm_run *kvm_run = vcpu->run;
@@ -6171,7 +6171,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 	return ret;
 }
 
-static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu, u32 *vcpu_exit_reason)
 {
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5d21a4a..073fe1f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5838,7 +5838,7 @@ void dump_vmcs(void)
  * The guest has exited.  See if we can fix it or if we need userspace
  * assistance.
  */
-static int vmx_handle_exit(struct kvm_vcpu *vcpu)
+static int vmx_handle_exit(struct kvm_vcpu *vcpu, u32 *vcpu_exit_reason)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 exit_reason = vmx->exit_reason;
@@ -5924,7 +5924,10 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	if (exit_reason < kvm_vmx_max_exit_handlers
+	if (*vcpu_exit_reason == EXIT_REASON_NEED_SKIP_EMULATED_INSN) {
+		kvm_skip_emulated_instruction(vcpu);
+		return 1;
+	} else if (exit_reason < kvm_vmx_max_exit_handlers
 	    && kvm_vmx_exit_handlers[exit_reason])
 		return kvm_vmx_exit_handlers[exit_reason](vcpu);
 	else {
@@ -6255,7 +6258,36 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 }
 STACK_FRAME_NON_STANDARD(handle_external_interrupt_irqoff);
 
-static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+static u32 handle_ipi_fastpath(struct kvm_vcpu *vcpu)
+{
+	u32 index;
+	u64 data;
+	int ret = 0;
+
+	if (lapic_in_kernel(vcpu) && apic_x2apic_mode(vcpu->arch.apic)) {
+		/*
+		 * fastpath to IPI target, FIXED+PHYSICAL which is popular
+		 */
+		index = kvm_rcx_read(vcpu);
+		data = kvm_read_edx_eax(vcpu);
+
+		if (((index - APIC_BASE_MSR) << 4 == APIC_ICR) &&
+		    ((data & KVM_APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
+		    ((data & APIC_MODE_MASK) == APIC_DM_FIXED)) {
+
+			trace_kvm_msr_write(index, data);
+			kvm_lapic_set_reg(vcpu->arch.apic, APIC_ICR2, (u32)(data >> 32));
+			ret = kvm_lapic_reg_write(vcpu->arch.apic, APIC_ICR, (u32)data);
+
+			if (ret == 0)
+				return EXIT_REASON_NEED_SKIP_EMULATED_INSN;
+		}
+	}
+
+	return ret;
+}
+
+static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu, u32 *exit_reason)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -6263,6 +6295,8 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 		handle_external_interrupt_irqoff(vcpu);
 	else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
 		handle_exception_nmi_irqoff(vmx);
+	else if (vmx->exit_reason == EXIT_REASON_MSR_WRITE)
+		*exit_reason = handle_ipi_fastpath(vcpu);
 }
 
 static bool vmx_has_emulated_msr(int index)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff395f8..130576b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7931,6 +7931,7 @@ EXPORT_SYMBOL_GPL(__kvm_request_immediate_exit);
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 {
 	int r;
+	u32 exit_reason = 0;
 	bool req_int_win =
 		dm_request_for_irq_injection(vcpu) &&
 		kvm_cpu_accept_dm_intr(vcpu);
@@ -8180,7 +8181,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
 
-	kvm_x86_ops->handle_exit_irqoff(vcpu);
+	kvm_x86_ops->handle_exit_irqoff(vcpu, &exit_reason);
 
 	/*
 	 * Consume any pending interrupts, including the possible source of
@@ -8224,7 +8225,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	kvm_lapic_sync_from_vapic(vcpu);
 
 	vcpu->arch.gpa_available = false;
-	r = kvm_x86_ops->handle_exit(vcpu);
+	r = kvm_x86_ops->handle_exit(vcpu, &exit_reason);
 
 	return r;
 
 cancel_injection:
-- 
2.7.4
end of thread, other threads:[~2019-11-15  5:56 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-09  7:05 [PATCH 1/2] KVM: X86: Single target IPI fastpath Wanpeng Li
2019-11-09  7:05 ` [PATCH 2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery Wanpeng Li
2019-11-11 21:59   ` Paolo Bonzini
2019-11-12  1:34     ` Wanpeng Li
2019-11-10  1:59 ` [PATCH 1/2] KVM: X86: Single target IPI fastpath kbuild test robot
2019-11-10  1:59   ` kbuild test robot
2019-11-10  1:59 ` [PATCH] KVM: X86: fix semicolon.cocci warnings kbuild test robot
2019-11-10  1:59   ` kbuild test robot
2019-11-11 13:06 ` [PATCH 1/2] KVM: X86: Single target IPI fastpath Vitaly Kuznetsov
2019-11-12  1:18   ` Wanpeng Li
2019-11-11 21:59 ` Paolo Bonzini
2019-11-12  1:33   ` Wanpeng Li
2019-11-13  6:05     ` Wanpeng Li
2019-11-14  3:12       ` Wanpeng Li
2019-11-14 11:58         ` Paolo Bonzini
2019-11-14 15:22           ` Sean Christopherson
2019-11-14 15:44             ` Paolo Bonzini
2019-11-14 16:34               ` Sean Christopherson
2019-11-15  5:56                 ` Wanpeng Li