* [patch 0/4] VMX: configure posted interrupt descriptor when assigning device
@ 2021-05-07 13:06 Marcelo Tosatti
  2021-05-07 13:06 ` [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops Marcelo Tosatti
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 13:06 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson

Configuration of the posted interrupt descriptor is incorrect when devices
are hotplugged to the guest (and vcpus are halted).

See patch 4 for details.

---

v2: rather than using a potentially racy IPI (vs vcpu->cpu switches),
kick the vcpus when assigning a device and let the blocked per-CPU
list manipulation happen locally at ->pre_block and ->post_block
(Sean Christopherson).

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
  2021-05-07 13:06 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device Marcelo Tosatti
@ 2021-05-07 13:06 ` Marcelo Tosatti
  2021-05-07 19:16   ` Peter Xu
  2021-05-07 13:06 ` [patch 2/4] KVM: add arch specific vcpu_check_block callback Marcelo Tosatti
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 13:06 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Marcelo Tosatti

Add a start_assignment hook to kvm_x86_ops, which is called when
kvm_arch_start_assignment is done.

The hook is required to update the wakeup vector of a sleeping vCPU
when a device is assigned to the guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
 
 	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
+	void (*start_assignment)(struct kvm *kvm, int device_count);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
 
Index: kvm/arch/x86/kvm/svm/svm.c
===================================================================
--- kvm.orig/arch/x86/kvm/svm/svm.c
+++ kvm/arch/x86/kvm/svm/svm.c
@@ -4601,6 +4601,7 @@ static struct kvm_x86_ops svm_x86_ops __
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.dy_apicv_has_pending_interrupt = svm_dy_apicv_has_pending_interrupt,
 	.update_pi_irte = svm_update_pi_irte,
+	.start_assignment = NULL,
 	.setup_mce = svm_setup_mce,
 
 	.smi_allowed = svm_smi_allowed,
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7732,6 +7732,7 @@ static struct kvm_x86_ops vmx_x86_ops __
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
+	.start_assignment = NULL,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,
Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -11295,7 +11295,10 @@ bool kvm_arch_can_dequeue_async_page_pre
 
 void kvm_arch_start_assignment(struct kvm *kvm)
 {
-	atomic_inc(&kvm->arch.assigned_device_count);
+	int ret;
+
+	ret = atomic_inc_return(&kvm->arch.assigned_device_count);
+	static_call_cond(kvm_x86_start_assignment)(kvm, ret);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
 
Index: kvm/arch/x86/include/asm/kvm-x86-ops.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm-x86-ops.h
+++ kvm/arch/x86/include/asm/kvm-x86-ops.h
@@ -99,6 +99,7 @@ KVM_X86_OP_NULL(post_block)
 KVM_X86_OP_NULL(vcpu_blocking)
 KVM_X86_OP_NULL(vcpu_unblocking)
 KVM_X86_OP_NULL(update_pi_irte)
+KVM_X86_OP_NULL(start_assignment)
 KVM_X86_OP_NULL(apicv_post_state_restore)
 KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
 KVM_X86_OP_NULL(set_hv_timer)

^ permalink raw reply	[flat|nested] 26+ messages in thread
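[Editor's sketch, not part of the thread.] The shape of the x86.c change in patch 1 (bump the assigned-device counter atomically, then pass the new count to an optional per-vendor hook) can be modelled in user space roughly as follows. This is only an illustration under stated assumptions: `struct vm`, `struct vm_ops`, `example_start_assignment` and `vm_start_assignment` are hypothetical stand-ins, C11 `atomic_fetch_add` replaces the kernel's `atomic_inc_return`, and a plain NULL check replaces `static_call_cond`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct kvm and kvm_x86_ops; not kernel code. */
struct vm;

struct vm_ops {
	/* Optional hook: may be NULL, like the SVM entry in the patch. */
	void (*start_assignment)(struct vm *vm, int device_count);
};

struct vm {
	atomic_int assigned_device_count;
	const struct vm_ops *ops;
	int first_assignment_seen;	/* bumped by the example hook below */
};

/* Example hook: reacts only to the 0 -> 1 transition. */
static void example_start_assignment(struct vm *vm, int device_count)
{
	if (device_count == 1)
		vm->first_assignment_seen++;
}

/* Models the shape of kvm_arch_start_assignment() after the patch. */
static int vm_start_assignment(struct vm *vm)
{
	/* atomic_fetch_add returns the old value, so add 1 for the new count. */
	int count = atomic_fetch_add(&vm->assigned_device_count, 1) + 1;

	assert(count > 0);

	/* Plain NULL check standing in for static_call_cond(). */
	if (vm->ops && vm->ops->start_assignment)
		vm->ops->start_assignment(vm, count);

	return count;
}
```

Calling `vm_start_assignment()` twice on a VM whose ops table points at `example_start_assignment` runs the hook both times, but only the first call (count 1) changes `first_assignment_seen`; with a NULL hook the counter still advances.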
* Re: [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
  2021-05-07 13:06 ` [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops Marcelo Tosatti
@ 2021-05-07 19:16   ` Peter Xu
  2021-05-10 17:53     ` Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2021-05-07 19:16 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Alex Williamson, Sean Christopherson

On Fri, May 07, 2021 at 10:06:10AM -0300, Marcelo Tosatti wrote:
> Add a start_assignment hook to kvm_x86_ops, which is called when
> kvm_arch_start_assignment is done.
>
> The hook is required to update the wakeup vector of a sleeping vCPU
> when a device is assigned to the guest.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Index: kvm/arch/x86/include/asm/kvm_host.h
> ===================================================================
> --- kvm.orig/arch/x86/include/asm/kvm_host.h
> +++ kvm/arch/x86/include/asm/kvm_host.h
> @@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
>
>  	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
>  			      uint32_t guest_irq, bool set);
> +	void (*start_assignment)(struct kvm *kvm, int device_count);

I'm thinking what the hook could do with the device_count besides comparing it
against 1...

If we can't think of any, perhaps we can directly make it an enablement hook
instead (so we avoid calling the hook at all when count>1)?

   /* Called when the first assignment registers (count from 0 to 1) */
   void (*enable_assignment)(struct kvm *kvm);

--
Peter Xu

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
  2021-05-07 19:16   ` Peter Xu
@ 2021-05-10 17:53     ` Marcelo Tosatti
  0 siblings, 0 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-10 17:53 UTC (permalink / raw)
  To: Peter Xu; +Cc: kvm, Paolo Bonzini, Alex Williamson, Sean Christopherson

On Fri, May 07, 2021 at 03:16:00PM -0400, Peter Xu wrote:
> On Fri, May 07, 2021 at 10:06:10AM -0300, Marcelo Tosatti wrote:
> > Add a start_assignment hook to kvm_x86_ops, which is called when
> > kvm_arch_start_assignment is done.
> >
> > The hook is required to update the wakeup vector of a sleeping vCPU
> > when a device is assigned to the guest.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >
> > Index: kvm/arch/x86/include/asm/kvm_host.h
> > ===================================================================
> > --- kvm.orig/arch/x86/include/asm/kvm_host.h
> > +++ kvm/arch/x86/include/asm/kvm_host.h
> > @@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
> >
> >  	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
> >  			      uint32_t guest_irq, bool set);
> > +	void (*start_assignment)(struct kvm *kvm, int device_count);
>
> I'm thinking what the hook could do with the device_count besides comparing it
> against 1...
>
> If we can't think of any, perhaps we can directly make it an enablement hook
> instead (so we avoid calling the hook at all when count>1)?
>
>    /* Called when the first assignment registers (count from 0 to 1) */
>    void (*enable_assignment)(struct kvm *kvm);

Sure, sounds good, just kept the original name...

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [patch 2/4] KVM: add arch specific vcpu_check_block callback
  2021-05-07 13:06 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device Marcelo Tosatti
  2021-05-07 13:06 ` [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops Marcelo Tosatti
@ 2021-05-07 13:06 ` Marcelo Tosatti
  2021-05-07 13:06 ` [patch 3/4] KVM: x86: implement kvm_arch_vcpu_check_block callback Marcelo Tosatti
  2021-05-07 13:06 ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device Marcelo Tosatti
  3 siblings, 0 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 13:06 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Marcelo Tosatti

Add callback in kvm_vcpu_check_block, so that architectures
can direct a vcpu to exit the vcpu block loop without requiring
events that would unhalt it.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/include/linux/kvm_host.h
===================================================================
--- kvm.orig/include/linux/kvm_host.h
+++ kvm/include/linux/kvm_host.h
@@ -971,6 +971,13 @@ static inline int kvm_arch_flush_remote_
 }
 #endif
 
+#ifndef __KVM_HAVE_ARCH_VCPU_CHECK_BLOCK
+static inline int kvm_arch_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+#endif
+
 #ifdef __KVM_HAVE_ARCH_NONCOHERENT_DMA
 void kvm_arch_register_noncoherent_dma(struct kvm *kvm);
 void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm);
Index: kvm/virt/kvm/kvm_main.c
===================================================================
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -2794,6 +2794,8 @@ static int kvm_vcpu_check_block(struct k
 		goto out;
 	if (signal_pending(current))
 		goto out;
+	if (kvm_arch_vcpu_check_block(vcpu))
+		goto out;
 
 	ret = 0;
 out:

^ permalink raw reply	[flat|nested] 26+ messages in thread
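[Editor's sketch, not part of the thread.] Patch 2 relies on a common kernel pattern: generic code supplies a default callback that returns 0 unless the architecture defines an opt-in macro and its own version. A minimal user-space model of that pattern, and of the point that the new check leaves the block loop without requesting an unhalt, might look like this. All names (`struct vcpu`, `HAVE_ARCH_VCPU_CHECK_BLOCK`, `EXIT_BLOCK_LOOP`, the boolean fields) are hypothetical; the kernel returns -EINTR and raises KVM_REQ_UNHALT instead of setting flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified vcpu state; not the kernel's struct kvm_vcpu. */
struct vcpu {
	bool runnable;		/* an event that would unhalt the vcpu */
	bool signal_pending;
	bool arch_wants_exit;	/* assumed arch-specific condition */
	bool unhalt_requested;	/* stands in for KVM_REQ_UNHALT */
};

/* An "architecture" opts in by defining the macro and the function. */
#define HAVE_ARCH_VCPU_CHECK_BLOCK
static inline int arch_vcpu_check_block(struct vcpu *vcpu)
{
	return vcpu->arch_wants_exit;
}

/* Generic fallback, compiled only when no architecture opted in. */
#ifndef HAVE_ARCH_VCPU_CHECK_BLOCK
static inline int arch_vcpu_check_block(struct vcpu *vcpu)
{
	(void)vcpu;
	return 0;
}
#endif

#define EXIT_BLOCK_LOOP (-1)	/* the kernel uses -EINTR here */

/* Returns 0 to keep blocking, EXIT_BLOCK_LOOP to leave the block loop. */
static int vcpu_check_block(struct vcpu *vcpu)
{
	int ret = EXIT_BLOCK_LOOP;

	assert(vcpu);

	if (vcpu->runnable) {
		vcpu->unhalt_requested = true;	/* only this path unhalts */
		goto out;
	}
	if (vcpu->signal_pending)
		goto out;
	if (arch_vcpu_check_block(vcpu))
		goto out;	/* leave the loop, but do not unhalt */

	ret = 0;
out:
	return ret;
}
```

With `arch_wants_exit` set and nothing else pending, `vcpu_check_block()` returns `EXIT_BLOCK_LOOP` while `unhalt_requested` stays false, which is the behaviour the commit message describes.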
* [patch 3/4] KVM: x86: implement kvm_arch_vcpu_check_block callback
  2021-05-07 13:06 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device Marcelo Tosatti
  2021-05-07 13:06 ` [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops Marcelo Tosatti
  2021-05-07 13:06 ` [patch 2/4] KVM: add arch specific vcpu_check_block callback Marcelo Tosatti
@ 2021-05-07 13:06 ` Marcelo Tosatti
  2021-05-07 13:06 ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device Marcelo Tosatti
  3 siblings, 0 replies; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 13:06 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Marcelo Tosatti

Implement kvm_arch_vcpu_check_block for x86. Next patch will add
implementation of kvm_x86_ops.vcpu_check_block for VMX.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -1320,6 +1320,8 @@ struct kvm_x86_ops {
 	void (*vcpu_blocking)(struct kvm_vcpu *vcpu);
 	void (*vcpu_unblocking)(struct kvm_vcpu *vcpu);
 
+	int (*vcpu_check_block)(struct kvm_vcpu *vcpu);
+
 	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
 	void (*start_assignment)(struct kvm *kvm, int device_count);
@@ -1801,6 +1803,15 @@ static inline bool kvm_irq_is_postable(s
 		irq->delivery_mode == APIC_DM_LOWEST);
 }
 
+#define __KVM_HAVE_ARCH_VCPU_CHECK_BLOCK
+static inline int kvm_arch_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+	if (kvm_x86_ops.vcpu_check_block)
+		return static_call(kvm_x86_vcpu_check_block)(vcpu);
+
+	return 0;
+}
+
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
 	static_call_cond(kvm_x86_vcpu_blocking)(vcpu);
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7727,6 +7727,7 @@ static struct kvm_x86_ops vmx_x86_ops __
 
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
+	.vcpu_check_block = NULL,
 
 	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
Index: kvm/arch/x86/include/asm/kvm-x86-ops.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm-x86-ops.h
+++ kvm/arch/x86/include/asm/kvm-x86-ops.h
@@ -98,6 +98,7 @@ KVM_X86_OP_NULL(pre_block)
 KVM_X86_OP_NULL(post_block)
 KVM_X86_OP_NULL(vcpu_blocking)
 KVM_X86_OP_NULL(vcpu_unblocking)
+KVM_X86_OP_NULL(vcpu_check_block)
 KVM_X86_OP_NULL(update_pi_irte)
 KVM_X86_OP_NULL(start_assignment)
 KVM_X86_OP_NULL(apicv_post_state_restore)
Index: kvm/arch/x86/kvm/svm/svm.c
===================================================================
--- kvm.orig/arch/x86/kvm/svm/svm.c
+++ kvm/arch/x86/kvm/svm/svm.c
@@ -4517,6 +4517,7 @@ static struct kvm_x86_ops svm_x86_ops __
 	.vcpu_put = svm_vcpu_put,
 	.vcpu_blocking = svm_vcpu_blocking,
 	.vcpu_unblocking = svm_vcpu_unblocking,
+	.vcpu_check_block = NULL,
 
 	.update_exception_bitmap = svm_update_exception_bitmap,
 	.get_msr_feature = svm_get_msr_feature,

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-07 13:06 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device Marcelo Tosatti
                   ` (2 preceding siblings ...)
  2021-05-07 13:06 ` [patch 3/4] KVM: x86: implement kvm_arch_vcpu_check_block callback Marcelo Tosatti
@ 2021-05-07 13:06 ` Marcelo Tosatti
  2021-05-07 17:22   ` Sean Christopherson
  3 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 13:06 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Pei Zhang,
	Marcelo Tosatti

For VMX, when a vcpu enters HLT emulation, pi_pre_block will:

1) Add vcpu to per-cpu list of blocked vcpus.

2) Program the posted-interrupt descriptor "notification vector"
to POSTED_INTR_WAKEUP_VECTOR

With interrupt remapping, an interrupt will set the PIR bit for the
vector programmed for the device on the CPU, test-and-set the
ON bit on the posted interrupt descriptor, and if the ON bit is clear
generate an interrupt for the notification vector.

This way, the target CPU wakes upon a device interrupt and wakes up
the target vcpu.

Problem is that pi_pre_block only programs the notification vector
if kvm_arch_has_assigned_device() is true. It's possible for the
following to happen:

1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false,
notification vector is not programmed
2) device is assigned to VM
3) device interrupts vcpu V, sets ON bit
(notification vector not programmed, so pcpu P remains in idle)
4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set,
kvm_vcpu_kick is skipped
5) vcpu 0 busy spins on vcpu V's response for several seconds, until
RCU watchdog NMIs all vCPUs.

To fix this, use the start_assignment kvm_x86_ops callback to kick
vcpus out of the halt loop, so the notification vector is
properly reprogrammed to the wakeup vector.

Reported-by: Pei Zhang <pezhang@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---

v2: add vmx_pi_start_assignment to vmx's kvm_x86_ops

Index: kvm/arch/x86/kvm/vmx/posted_intr.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
+++ kvm/arch/x86/kvm/vmx/posted_intr.c
@@ -203,6 +203,25 @@ void pi_post_block(struct kvm_vcpu *vcpu
 	local_irq_enable();
 }
 
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	if (!kvm_vcpu_apicv_active(vcpu))
+		return 0;
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
+		return 0;
+
+	return 1;
+}
+
 /*
  * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
  */
@@ -236,6 +255,26 @@ bool pi_has_pending_interrupt(struct kvm
 		(pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
 }
 
+void vmx_pi_start_assignment(struct kvm *kvm, int device_count)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	/* only care about first device assignment */
+	if (device_count != 1)
+		return;
+
+	/* Update wakeup vector and add vcpu to blocked_vcpu_list */
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_vcpu_apicv_active(vcpu))
+			continue;
+
+		kvm_vcpu_kick(vcpu);
+	}
+}
 
 /*
  * pi_update_irte - set IRTE for Posted-Interrupts
Index: kvm/arch/x86/kvm/vmx/posted_intr.h
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.h
+++ kvm/arch/x86/kvm/vmx/posted_intr.h
@@ -95,5 +95,7 @@ void __init pi_init_cpu(int cpu);
 bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
 int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
 		   bool set);
+void vmx_pi_start_assignment(struct kvm *kvm, int device_count);
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_X86_VMX_POSTED_INTR_H */
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7727,13 +7727,13 @@ static struct kvm_x86_ops vmx_x86_ops __
 
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
-	.vcpu_check_block = NULL,
+	.vcpu_check_block = vmx_vcpu_check_block,
 
 	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
-	.start_assignment = NULL,
+	.start_assignment = vmx_pi_start_assignment,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,

^ permalink raw reply	[flat|nested] 26+ messages in thread
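[Editor's sketch, not part of the thread.] The interaction patch 4 describes, where the wakeup vector is only programmed at block time if a device is already assigned and a later check notices a stale vector, can be modelled in a few lines of user-space C. Everything here is an assumption made for illustration: `struct model_vm`, `struct model_vcpu`, the `model_*` helpers and the vector values are hypothetical, and the capability and APICv checks from `vmx_vcpu_check_block()` are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative vector numbers only; treat the values as assumptions. */
enum { NOTIFY_VECTOR = 0xf2, WAKEUP_VECTOR = 0xf1 };

struct model_vm {
	int assigned_devices;
};

struct model_vcpu {
	struct model_vm *vm;
	int nv;			/* notification vector in the PI descriptor */
	bool on_blocked_list;
};

/* Block-time setup: programmed only if a device is already assigned. */
static void model_pre_block(struct model_vcpu *v)
{
	assert(v && v->vm);

	if (v->vm->assigned_devices == 0)
		return;		/* nothing can post interrupts yet */

	v->on_blocked_list = true;
	v->nv = WAKEUP_VECTOR;
}

/* Undo the block-time setup when the vcpu leaves the block loop. */
static void model_post_block(struct model_vcpu *v)
{
	v->on_blocked_list = false;
	v->nv = NOTIFY_VECTOR;
}

/* Same decision as vmx_vcpu_check_block(), minus the capability checks. */
static int model_check_block(const struct model_vcpu *v)
{
	if (v->vm->assigned_devices == 0)
		return 0;
	if (v->nv == WAKEUP_VECTOR)
		return 0;
	return 1;	/* device appeared after pre_block: leave the loop */
}
```

In this model a vcpu that blocks with no devices keeps `NOTIFY_VECTOR`; once `assigned_devices` becomes 1, `model_check_block()` returns 1, and running `model_post_block()` followed by `model_pre_block()` leaves the vcpu on the blocked list with `WAKEUP_VECTOR` programmed.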
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-07 13:06 ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device Marcelo Tosatti
@ 2021-05-07 17:22   ` Sean Christopherson
  2021-05-07 19:29     ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2021-05-07 17:22 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Fri, May 07, 2021, Marcelo Tosatti wrote:
> Index: kvm/arch/x86/kvm/vmx/posted_intr.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
> +++ kvm/arch/x86/kvm/vmx/posted_intr.c
> @@ -203,6 +203,25 @@ void pi_post_block(struct kvm_vcpu *vcpu
>  	local_irq_enable();
>  }
>
> +int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
> +{
> +	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> +
> +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> +		return 0;
> +
> +	if (!kvm_vcpu_apicv_active(vcpu))
> +		return 0;
> +
> +	if (!kvm_arch_has_assigned_device(vcpu->kvm))
> +		return 0;
> +
> +	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
> +		return 0;
> +
> +	return 1;

IIUC, the logic is to bail out of the block loop if the VM has an assigned
device, but the blocking vCPU didn't reconfigure the PI.NV to the wakeup vector,
i.e. the assigned device came along after the initial check in vcpu_block().
That makes sense, but you can add a comment somewhere in/above this function?

> +}
> +
>  /*
>   * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
>   */
> @@ -236,6 +255,26 @@ bool pi_has_pending_interrupt(struct kvm
>  		(pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
>  }
>
> +void vmx_pi_start_assignment(struct kvm *kvm, int device_count)
> +{
> +	struct kvm_vcpu *vcpu;
> +	int i;
> +
> +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> +		return;
> +
> +	/* only care about first device assignment */
> +	if (device_count != 1)
> +		return;
> +
> +	/* Update wakeup vector and add vcpu to blocked_vcpu_list */

Can you expand this comment, too?  Specifically, I think what you're saying is
that the wakeup will cause the vCPU to bail out of kvm_vcpu_block() and go back
through vcpu_block() and thus pi_pre_block().

> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		if (!kvm_vcpu_apicv_active(vcpu))
> +			continue;
> +
> +		kvm_vcpu_kick(vcpu);

Actually, can't we avoid the full kick and instead just do kvm_vcpu_wake_up()?
If the vCPU is in guest mode, i.e. kvm_arch_vcpu_should_kick() returns true,
then by definition it can't be blocking.  And if it about to block, it's
guaranteed to see the assigned device.

> +	}
> +}

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-07 17:22   ` Sean Christopherson
@ 2021-05-07 19:29     ` Peter Xu
  2021-05-07 22:08       ` Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2021-05-07 19:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Fri, May 07, 2021 at 05:22:07PM +0000, Sean Christopherson wrote:
> On Fri, May 07, 2021, Marcelo Tosatti wrote:
> > Index: kvm/arch/x86/kvm/vmx/posted_intr.c
> > ===================================================================
> > --- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
> > +++ kvm/arch/x86/kvm/vmx/posted_intr.c
> > @@ -203,6 +203,25 @@ void pi_post_block(struct kvm_vcpu *vcpu
> >  	local_irq_enable();
> >  }
> >
> > +int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
> > +{
> > +	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> > +
> > +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> > +		return 0;
> > +
> > +	if (!kvm_vcpu_apicv_active(vcpu))
> > +		return 0;
> > +
> > +	if (!kvm_arch_has_assigned_device(vcpu->kvm))
> > +		return 0;
> > +
> > +	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
> > +		return 0;
> > +
> > +	return 1;
>
> IIUC, the logic is to bail out of the block loop if the VM has an assigned
> device, but the blocking vCPU didn't reconfigure the PI.NV to the wakeup vector,
> i.e. the assigned device came along after the initial check in vcpu_block().
> That makes sense, but you can add a comment somewhere in/above this function?

Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
somehow, so that even without customized ->vcpu_check_block we should be able
to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?

--
Peter Xu

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-07 19:29     ` Peter Xu
@ 2021-05-07 22:08       ` Marcelo Tosatti
  2021-05-11 14:39         ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-07 22:08 UTC (permalink / raw)
  To: Peter Xu
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Fri, May 07, 2021 at 03:29:05PM -0400, Peter Xu wrote:
> On Fri, May 07, 2021 at 05:22:07PM +0000, Sean Christopherson wrote:
> > On Fri, May 07, 2021, Marcelo Tosatti wrote:
> > > Index: kvm/arch/x86/kvm/vmx/posted_intr.c
> > > ===================================================================
> > > --- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
> > > +++ kvm/arch/x86/kvm/vmx/posted_intr.c
> > > @@ -203,6 +203,25 @@ void pi_post_block(struct kvm_vcpu *vcpu
> > >  	local_irq_enable();
> > >  }
> > >
> > > +int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
> > > +{
> > > +	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> > > +
> > > +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> > > +		return 0;
> > > +
> > > +	if (!kvm_vcpu_apicv_active(vcpu))
> > > +		return 0;
> > > +
> > > +	if (!kvm_arch_has_assigned_device(vcpu->kvm))
> > > +		return 0;
> > > +
> > > +	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
> > > +		return 0;
> > > +
> > > +	return 1;
> >
> > IIUC, the logic is to bail out of the block loop if the VM has an assigned
> > device, but the blocking vCPU didn't reconfigure the PI.NV to the wakeup vector,
> > i.e. the assigned device came along after the initial check in vcpu_block().
> > That makes sense, but you can add a comment somewhere in/above this function?
>
> Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> somehow, so that even without customized ->vcpu_check_block we should be able
> to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?

static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
{
        int ret = -EINTR;
        int idx = srcu_read_lock(&vcpu->kvm->srcu);

        if (kvm_arch_vcpu_runnable(vcpu)) {
                kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
                goto out;
        }

Don't want to unhalt the vcpu.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-07 22:08       ` Marcelo Tosatti
@ 2021-05-11 14:39         ` Peter Xu
  2021-05-11 14:51           ` Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2021-05-11 14:39 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote:
> > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> > somehow, so that even without customized ->vcpu_check_block we should be able
> > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?
>
> static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
> {
>         int ret = -EINTR;
>         int idx = srcu_read_lock(&vcpu->kvm->srcu);
>
>         if (kvm_arch_vcpu_runnable(vcpu)) {
>                 kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
>                 goto out;
>         }
>
> Don't want to unhalt the vcpu.

Could you elaborate?  It's not obvious to me why we can't do that if
pi_test_on() returns true..  we have pending post interrupts anyways, so
shouldn't we stop halting?  Thanks!

--
Peter Xu

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-11 14:39         ` Peter Xu
@ 2021-05-11 14:51           ` Marcelo Tosatti
  2021-05-11 16:19             ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-11 14:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote:
> On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote:
> > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> > > somehow, so that even without customized ->vcpu_check_block we should be able
> > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?
> >
> > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
> > {
> >         int ret = -EINTR;
> >         int idx = srcu_read_lock(&vcpu->kvm->srcu);
> >
> >         if (kvm_arch_vcpu_runnable(vcpu)) {
> >                 kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
> >                 goto out;
> >         }
> >
> > Don't want to unhalt the vcpu.
>
> Could you elaborate?  It's not obvious to me why we can't do that if
> pi_test_on() returns true..  we have pending post interrupts anyways, so
> shouldn't we stop halting?  Thanks!

pi_test_on() only returns true when an interrupt is signalled by the
device. But the sequence of events is:


1. pCPU idles without notification vector configured to wakeup vector.

2. PCI device is hotplugged, assigned device count increases from 0 to 1.

<arbitrary amount of time>

3. device generates interrupt, sets ON bit to true in the posted
interrupt descriptor.

We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit
is not set).

^ permalink raw reply	[flat|nested] 26+ messages in thread
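[Editor's sketch, not part of the thread.] The three-step sequence above can be replayed against two candidate "leave the block loop" predicates: one keyed on the ON bit (the pi_test_on() idea) and one keyed on a stale notification vector (the vmx_vcpu_check_block() idea). The model below is an assumption-laden illustration: `struct pi_state`, the `step*` helpers and both predicates are hypothetical names, and the state is reduced to three booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-field model of the state discussed in the thread. */
struct pi_state {
	bool has_assigned_device;
	bool nv_is_wakeup;	/* notification vector == wakeup vector */
	bool on;		/* ON bit: a posted interrupt is outstanding */
};

/* Predicate along the lines of the pi_test_on() suggestion. */
static bool exit_if_on_set(const struct pi_state *s)
{
	assert(s);
	return s->on;
}

/* Predicate along the lines of vmx_vcpu_check_block(). */
static bool exit_if_nv_stale(const struct pi_state *s)
{
	assert(s);
	return s->has_assigned_device && !s->nv_is_wakeup;
}

/* Step 1: vcpu halts with no device, wakeup vector not configured. */
static struct pi_state step1_halted(void)
{
	struct pi_state s = { false, false, false };
	return s;
}

/* Step 2: device hotplug, assigned device count goes from 0 to 1. */
static struct pi_state step2_hotplug(struct pi_state s)
{
	s.has_assigned_device = true;
	return s;
}

/* Step 3: some arbitrary time later, the device posts an interrupt. */
static struct pi_state step3_interrupt(struct pi_state s)
{
	s.on = true;
	return s;
}
```

After step 2 only `exit_if_nv_stale()` is true, so it is the only one of the two that lets the vcpu leave the block loop before the first interrupt arrives; `exit_if_on_set()` does not fire until step 3, by which point the notification has already been sent to the wrong vector.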
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-11 14:51           ` Marcelo Tosatti
@ 2021-05-11 16:19             ` Peter Xu
  2021-05-11 17:18               ` Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2021-05-11 16:19 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote:
> On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote:
> > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote:
> > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> > > > somehow, so that even without customized ->vcpu_check_block we should be able
> > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?
> > >
> > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
> > > {
> > >         int ret = -EINTR;
> > >         int idx = srcu_read_lock(&vcpu->kvm->srcu);
> > >
> > >         if (kvm_arch_vcpu_runnable(vcpu)) {
> > >                 kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
> > >                 goto out;
> > >         }
> > >
> > > Don't want to unhalt the vcpu.
> >
> > Could you elaborate?  It's not obvious to me why we can't do that if
> > pi_test_on() returns true..  we have pending post interrupts anyways, so
> > shouldn't we stop halting?  Thanks!
>
> pi_test_on() only returns true when an interrupt is signalled by the
> device. But the sequence of events is:
>
>
> 1. pCPU idles without notification vector configured to wakeup vector.
>
> 2. PCI device is hotplugged, assigned device count increases from 0 to 1.
>
> <arbitrary amount of time>
>
> 3. device generates interrupt, sets ON bit to true in the posted
> interrupt descriptor.
>
> We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit
> is not set).

Ah yes.. thanks.

Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to
define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h):

#define KVM_REQ_UNBLOCK			KVM_ARCH_REQ(31)

We can set it in vmx_pi_start_assignment(), then check+clear it in
kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?).

The thing is current vmx_vcpu_check_block() is mostly a sanity check and
copy-paste of the pi checks on a few items, so maybe cleaner to use
KVM_REQ_UNBLOCK, as it might be reused in the future for re-evaluating of
pre-block for similar purpose?

No strong opinion, though.

--
Peter Xu

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
  2021-05-11 16:19             ` Peter Xu
@ 2021-05-11 17:18               ` Marcelo Tosatti
  2021-05-11 21:35                 ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2021-05-11 17:18 UTC (permalink / raw)
  To: Peter Xu, Paolo Bonzini
  Cc: Sean Christopherson, kvm, Paolo Bonzini, Alex Williamson, Pei Zhang

On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote:
> On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote:
> > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote:
> > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote:
> > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> > > > > somehow, so that even without customized ->vcpu_check_block we should be able
> > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?
> > > >
> > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
> > > > {
> > > >         int ret = -EINTR;
> > > >         int idx = srcu_read_lock(&vcpu->kvm->srcu);
> > > >
> > > >         if (kvm_arch_vcpu_runnable(vcpu)) {
> > > >                 kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
> > > >                 goto out;
> > > >         }
> > > >
> > > > Don't want to unhalt the vcpu.
> > >
> > > Could you elaborate?  It's not obvious to me why we can't do that if
> > > pi_test_on() returns true..  we have pending post interrupts anyways, so
> > > shouldn't we stop halting?  Thanks!
> >
> > pi_test_on() only returns true when an interrupt is signalled by the
> > device. But the sequence of events is:
> >
> >
> > 1. pCPU idles without notification vector configured to wakeup vector.
> >
> > 2. PCI device is hotplugged, assigned device count increases from 0 to 1.
> >
> > <arbitrary amount of time>
> >
> > 3. device generates interrupt, sets ON bit to true in the posted
> > interrupt descriptor.
> >
> > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit
> > is not set).
>
> Ah yes.. thanks.
>
> Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to
> define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h):
>
> #define KVM_REQ_UNBLOCK			KVM_ARCH_REQ(31)
>
> We can set it in vmx_pi_start_assignment(), then check+clear it in
> kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?).

Can't check it in kvm_vcpu_has_events() because that will set
KVM_REQ_UNHALT (which we don't want).

I think KVM_REQ_UNBLOCK will add more lines of code.

> The thing is current vmx_vcpu_check_block() is mostly a sanity check and
> copy-paste of the pi checks on a few items, so maybe cleaner to use
> KVM_REQ_UNBLOCK, as it might be reused in the future for re-evaluating of
> pre-block for similar purpose?
>
> No strong opinion, though.

Hum... IMHO v3 is quite clean already (although i don't object to your
suggestion).

Paolo, what do you think?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device 2021-05-11 17:18 ` Marcelo Tosatti @ 2021-05-11 21:35 ` Peter Xu 2021-05-11 23:51 ` Marcelo Tosatti 0 siblings, 1 reply; 26+ messages in thread From: Peter Xu @ 2021-05-11 21:35 UTC (permalink / raw) To: Marcelo Tosatti Cc: Paolo Bonzini, Sean Christopherson, kvm, Alex Williamson, Pei Zhang [-- Attachment #1: Type: text/plain, Size: 3516 bytes --] On Tue, May 11, 2021 at 02:18:10PM -0300, Marcelo Tosatti wrote: > On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote: > > On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote: > > > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote: > > > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote: > > > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events() > > > > > > somehow, so that even without customized ->vcpu_check_block we should be able > > > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)? > > > > > > > > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) > > > > > { > > > > > int ret = -EINTR; > > > > > int idx = srcu_read_lock(&vcpu->kvm->srcu); > > > > > > > > > > if (kvm_arch_vcpu_runnable(vcpu)) { > > > > > kvm_make_request(KVM_REQ_UNHALT, vcpu); <--- > > > > > goto out; > > > > > } > > > > > > > > > > Don't want to unhalt the vcpu. > > > > > > > > Could you elaborate? It's not obvious to me why we can't do that if > > > > pi_test_on() returns true.. we have pending post interrupts anyways, so > > > > shouldn't we stop halting? Thanks! > > > > > > pi_test_on() only returns true when an interrupt is signalled by the > > > device. But the sequence of events is: > > > > > > > > > 1. pCPU idles without notification vector configured to wakeup vector. > > > > > > 2. PCI device is hotplugged, assigned device count increases from 0 to 1. > > > > > > <arbitrary amount of time> > > > > > > 3. 
device generates interrupt, sets ON bit to true in the posted > > > interrupt descriptor. > > > > > > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit > > > is not set). > > > > Ah yes.. thanks. > > > > Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to > > define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h): > > > > #define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) > > > > We can set it in vmx_pi_start_assignment(), then check+clear it in > > kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?). > > Can't check it in kvm_vcpu_has_events() because that will set > KVM_REQ_UNHALT (which we don't want). I thought it was okay to break the guest HLT? As IMHO the guest code should always be able to re-run the HLT when interrupted? As IIUC HLT can easily be interrupted by e.g., SMIs, according to SDM Vol.2. Not to mention vfio hotplug should be rare, and we'll only trigger this once for the 1st device. > > I think KVM_REQ_UNBLOCK will add more lines of code. It's very possible I overlooked something above... but if breaking HLT unregularly is okay, I attached one patch that is based on your v3 series, just dropped the vcpu_check_block() but use KVM_REQ_UNBLOCK (no compile test even, just to satisfy my own curiosity on how many loc we can save.. :), it gives me: 7 files changed, 5 insertions(+), 41 deletions(-) But again, I could have missed something... Thanks, > > > The thing is current vmx_vcpu_check_block() is mostly a sanity check and > > copy-paste of the pi checks on a few items, so maybe cleaner to use > > KVM_REQ_UNBLOCK, as it might be reused in the future for re-evaluating of > > pre-block for similar purpose? > > > > No strong opinion, though. > > Hum... IMHO v3 is quite clean already (although i don't object to your > suggestion). > > Paolo, what do you think? 
> > > -- Peter Xu [-- Attachment #2: 0001-replace-vcpu_check_block-hook-with-KVM_REQ_UNBLOCK.patch --] [-- Type: text/plain, Size: 5567 bytes --] From 1131248f3c8f1f2715dd49d439c9fab25b4db9b8 Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx@redhat.com> Date: Tue, 11 May 2021 17:33:21 -0400 Subject: [PATCH] replace vcpu_check_block() hook with KVM_REQ_UNBLOCK Signed-off-by: Peter Xu <peterx@redhat.com> --- arch/x86/include/asm/kvm-x86-ops.h | 1 - arch/x86/include/asm/kvm_host.h | 12 +----------- arch/x86/kvm/svm/svm.c | 1 - arch/x86/kvm/vmx/posted_intr.c | 27 +-------------------------- arch/x86/kvm/vmx/posted_intr.h | 1 - arch/x86/kvm/vmx/vmx.c | 1 - arch/x86/kvm/x86.c | 3 +++ 7 files changed, 5 insertions(+), 41 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index fc99fb779fd21..e7bef91cee04a 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -98,7 +98,6 @@ KVM_X86_OP_NULL(pre_block) KVM_X86_OP_NULL(post_block) KVM_X86_OP_NULL(vcpu_blocking) KVM_X86_OP_NULL(vcpu_unblocking) -KVM_X86_OP_NULL(vcpu_check_block) KVM_X86_OP_NULL(update_pi_irte) KVM_X86_OP_NULL(start_assignment) KVM_X86_OP_NULL(apicv_post_state_restore) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5bf7bd0e59582..74ab042e9b146 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -91,6 +91,7 @@ #define KVM_REQ_MSR_FILTER_CHANGED KVM_ARCH_REQ(29) #define KVM_REQ_UPDATE_CPU_DIRTY_LOGGING \ KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) +#define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) #define CR0_RESERVED_BITS \ (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \ @@ -1350,8 +1351,6 @@ struct kvm_x86_ops { void (*vcpu_blocking)(struct kvm_vcpu *vcpu); void (*vcpu_unblocking)(struct kvm_vcpu *vcpu); - int (*vcpu_check_block)(struct kvm_vcpu *vcpu); - int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq, uint32_t 
guest_irq, bool set); void (*start_assignment)(struct kvm *kvm); @@ -1835,15 +1834,6 @@ static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq) irq->delivery_mode == APIC_DM_LOWEST); } -#define __KVM_HAVE_ARCH_VCPU_CHECK_BLOCK -static inline int kvm_arch_vcpu_check_block(struct kvm_vcpu *vcpu) -{ - if (kvm_x86_ops.vcpu_check_block) - return static_call(kvm_x86_vcpu_check_block)(vcpu); - - return 0; -} - static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) { static_call_cond(kvm_x86_vcpu_blocking)(vcpu); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index cda5ccb4d9d1b..8b03795cfcd11 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4459,7 +4459,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .vcpu_put = svm_vcpu_put, .vcpu_blocking = svm_vcpu_blocking, .vcpu_unblocking = svm_vcpu_unblocking, - .vcpu_check_block = NULL, .update_exception_bitmap = svm_update_exception_bitmap, .get_msr_feature = svm_get_msr_feature, diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c index 2d0d009965530..0b74d598ebcbd 100644 --- a/arch/x86/kvm/vmx/posted_intr.c +++ b/arch/x86/kvm/vmx/posted_intr.c @@ -203,32 +203,6 @@ void pi_post_block(struct kvm_vcpu *vcpu) local_irq_enable(); } -/* - * Bail out of the block loop if the VM has an assigned - * device, but the blocking vCPU didn't reconfigure the - * PI.NV to the wakeup vector, i.e. the assigned device - * came along after the initial check in vcpu_block(). - */ - -int vmx_vcpu_check_block(struct kvm_vcpu *vcpu) -{ - struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); - - if (!irq_remapping_cap(IRQ_POSTING_CAP)) - return 0; - - if (!kvm_vcpu_apicv_active(vcpu)) - return 0; - - if (!kvm_arch_has_assigned_device(vcpu->kvm)) - return 0; - - if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) - return 0; - - return 1; -} - /* * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR. 
*/ @@ -278,6 +252,7 @@ void vmx_pi_start_assignment(struct kvm *kvm) if (!kvm_vcpu_apicv_active(vcpu)) continue; + kvm_make_request(KVM_REQ_UNBLOCK, vcpu); kvm_vcpu_wake_up(vcpu); } } diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h index 2aa082fd1c7ab..7f7b2326caf53 100644 --- a/arch/x86/kvm/vmx/posted_intr.h +++ b/arch/x86/kvm/vmx/posted_intr.h @@ -96,6 +96,5 @@ bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu); int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, bool set); void vmx_pi_start_assignment(struct kvm *kvm); -int vmx_vcpu_check_block(struct kvm_vcpu *vcpu); #endif /* __KVM_X86_VMX_POSTED_INTR_H */ diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ab68fed8b7e43..639ec3eba9b80 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7716,7 +7716,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .pre_block = vmx_pre_block, .post_block = vmx_post_block, - .vcpu_check_block = vmx_vcpu_check_block, .pmu_ops = &intel_pmu_ops, .nested_ops = &vmx_nested_ops, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e6fee59b5dab6..739e1bd59e8a9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11177,6 +11177,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) static_call(kvm_x86_smi_allowed)(vcpu, false))) return true; + if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu)) + return true; + if (kvm_arch_interrupt_allowed(vcpu) && (kvm_cpu_has_interrupt(vcpu) || kvm_guest_apic_has_interrupt(vcpu))) -- 2.31.1 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device 2021-05-11 21:35 ` Peter Xu @ 2021-05-11 23:51 ` Marcelo Tosatti 2021-05-12 0:02 ` Marcelo Tosatti 0 siblings, 1 reply; 26+ messages in thread From: Marcelo Tosatti @ 2021-05-11 23:51 UTC (permalink / raw) To: Peter Xu Cc: Paolo Bonzini, Sean Christopherson, kvm, Alex Williamson, Pei Zhang On Tue, May 11, 2021 at 05:35:41PM -0400, Peter Xu wrote: > On Tue, May 11, 2021 at 02:18:10PM -0300, Marcelo Tosatti wrote: > > On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote: > > > On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote: > > > > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote: > > > > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote: > > > > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events() > > > > > > > somehow, so that even without customized ->vcpu_check_block we should be able > > > > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)? > > > > > > > > > > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) > > > > > > { > > > > > > int ret = -EINTR; > > > > > > int idx = srcu_read_lock(&vcpu->kvm->srcu); > > > > > > > > > > > > if (kvm_arch_vcpu_runnable(vcpu)) { > > > > > > kvm_make_request(KVM_REQ_UNHALT, vcpu); <--- > > > > > > goto out; > > > > > > } > > > > > > > > > > > > Don't want to unhalt the vcpu. > > > > > > > > > > Could you elaborate? It's not obvious to me why we can't do that if > > > > > pi_test_on() returns true.. we have pending post interrupts anyways, so > > > > > shouldn't we stop halting? Thanks! > > > > > > > > pi_test_on() only returns true when an interrupt is signalled by the > > > > device. But the sequence of events is: > > > > > > > > > > > > 1. pCPU idles without notification vector configured to wakeup vector. > > > > > > > > 2. PCI device is hotplugged, assigned device count increases from 0 to 1. 
> > > > > > > > <arbitrary amount of time> > > > > > > > > 3. device generates interrupt, sets ON bit to true in the posted > > > > interrupt descriptor. > > > > > > > > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit > > > > is not set). > > > > > > Ah yes.. thanks. > > > > > > Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to > > > define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h): > > > > > > #define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) > > > > > > We can set it in vmx_pi_start_assignment(), then check+clear it in > > > kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?). > > > > Can't check it in kvm_vcpu_has_events() because that will set > > KVM_REQ_UNHALT (which we don't want). > > I thought it was okay to break the guest HLT? Intel: "HLT-HALT Description Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer (CS:EIP) points to the instruction following the HLT instruction." AMD: "6.5 Processor Halt The processor halt instruction (HLT) halts instruction execution, leaving the processor in the halt state. No registers or machine state are modified as a result of executing the HLT instruction. The processor remains in the halt state until one of the following occurs: • A non-maskable interrupt (NMI). • An enabled, maskable interrupt (INTR). • Processor reset (RESET). • Processor initialization (INIT). • System-management interrupt (SMI)." The KVM_REQ_UNBLOCK patch will resume execution even any such event occuring. So the behaviour would be different from baremetal. > As IMHO the guest code should > always be able to re-run the HLT when interrupted? 
As IIUC HLT can easily be > interrupted by e.g., SMIs, according to SDM Vol.2. CPU will by default return to HLT'ed state, not continue to the instruction following HLT, on SMI: 34.10 AUTO HALT RESTART If the processor is in a HALT state (due to the prior execution of a HLT instruction) when it receives an SMI, the processor records the fact in the auto HALT restart flag in the saved processor state (see Figure 34-3). (This flag is located at offset 7F02H and bit 0 in the state save area of the SMRAM.) If the processor sets the auto HALT restart flag upon entering SMM (indicating that the SMI occurred when the processor was in the HALT state), the SMI handler has two options: * It can leave the auto HALT restart flag set, which instructs the RSM instruction to return program control to the HLT instruction. This option in effect causes the processor to re-enter the HALT state after handling the SMI. (This is the default operation.) * It can clear the auto HALT restart flag, which instructs the RSM instruction to return program control to the instruction following the HLT instruction. > Not to mention vfio hotplug > should be rare, and we'll only trigger this once for the 1st device. > > > > > I think KVM_REQ_UNBLOCK will add more lines of code. > > It's very possible I overlooked something above... but if breaking HLT > unregularly is okay, I attached one patch that is based on your v3 series, just > dropped the vcpu_check_block() but use KVM_REQ_UNBLOCK (no compile test even, > just to satisfy my own curiosity on how many loc we can save.. :), it gives me: > > 7 files changed, 5 insertions(+), 41 deletions(-) > > But again, I could have missed something... > > Thanks, > > > > > > The thing is current vmx_vcpu_check_block() is mostly a sanity check and > > > copy-paste of the pi checks on a few items, so maybe cleaner to use > > > KVM_REQ_UNBLOCK, as it might be reused in the future for re-evaluating of > > > pre-block for similar purpose? 
> > > > > > No strong opinion, though. > > > > Hum... IMHO v3 is quite clean already (although i don't object to your > > suggestion). > > > > Paolo, what do you think? > > > > > > > > -- > Peter Xu > >From 1131248f3c8f1f2715dd49d439c9fab25b4db9b8 Mon Sep 17 00:00:00 2001 > From: Peter Xu <peterx@redhat.com> > Date: Tue, 11 May 2021 17:33:21 -0400 > Subject: [PATCH] replace vcpu_check_block() hook with KVM_REQ_UNBLOCK > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > arch/x86/include/asm/kvm-x86-ops.h | 1 - > arch/x86/include/asm/kvm_host.h | 12 +----------- > arch/x86/kvm/svm/svm.c | 1 - > arch/x86/kvm/vmx/posted_intr.c | 27 +-------------------------- > arch/x86/kvm/vmx/posted_intr.h | 1 - > arch/x86/kvm/vmx/vmx.c | 1 - > arch/x86/kvm/x86.c | 3 +++ > 7 files changed, 5 insertions(+), 41 deletions(-) > > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h > index fc99fb779fd21..e7bef91cee04a 100644 > --- a/arch/x86/include/asm/kvm-x86-ops.h > +++ b/arch/x86/include/asm/kvm-x86-ops.h > @@ -98,7 +98,6 @@ KVM_X86_OP_NULL(pre_block) > KVM_X86_OP_NULL(post_block) > KVM_X86_OP_NULL(vcpu_blocking) > KVM_X86_OP_NULL(vcpu_unblocking) > -KVM_X86_OP_NULL(vcpu_check_block) > KVM_X86_OP_NULL(update_pi_irte) > KVM_X86_OP_NULL(start_assignment) > KVM_X86_OP_NULL(apicv_post_state_restore) > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 5bf7bd0e59582..74ab042e9b146 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -91,6 +91,7 @@ > #define KVM_REQ_MSR_FILTER_CHANGED KVM_ARCH_REQ(29) > #define KVM_REQ_UPDATE_CPU_DIRTY_LOGGING \ > KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) > +#define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) > > #define CR0_RESERVED_BITS \ > (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \ > @@ -1350,8 +1351,6 @@ struct kvm_x86_ops { > void (*vcpu_blocking)(struct kvm_vcpu *vcpu); > void (*vcpu_unblocking)(struct 
kvm_vcpu *vcpu); > > - int (*vcpu_check_block)(struct kvm_vcpu *vcpu); > - > int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq, > uint32_t guest_irq, bool set); > void (*start_assignment)(struct kvm *kvm); > @@ -1835,15 +1834,6 @@ static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq) > irq->delivery_mode == APIC_DM_LOWEST); > } > > -#define __KVM_HAVE_ARCH_VCPU_CHECK_BLOCK > -static inline int kvm_arch_vcpu_check_block(struct kvm_vcpu *vcpu) > -{ > - if (kvm_x86_ops.vcpu_check_block) > - return static_call(kvm_x86_vcpu_check_block)(vcpu); > - > - return 0; > -} > - > static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) > { > static_call_cond(kvm_x86_vcpu_blocking)(vcpu); > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index cda5ccb4d9d1b..8b03795cfcd11 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -4459,7 +4459,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { > .vcpu_put = svm_vcpu_put, > .vcpu_blocking = svm_vcpu_blocking, > .vcpu_unblocking = svm_vcpu_unblocking, > - .vcpu_check_block = NULL, > > .update_exception_bitmap = svm_update_exception_bitmap, > .get_msr_feature = svm_get_msr_feature, > diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c > index 2d0d009965530..0b74d598ebcbd 100644 > --- a/arch/x86/kvm/vmx/posted_intr.c > +++ b/arch/x86/kvm/vmx/posted_intr.c > @@ -203,32 +203,6 @@ void pi_post_block(struct kvm_vcpu *vcpu) > local_irq_enable(); > } > > -/* > - * Bail out of the block loop if the VM has an assigned > - * device, but the blocking vCPU didn't reconfigure the > - * PI.NV to the wakeup vector, i.e. the assigned device > - * came along after the initial check in vcpu_block(). 
> - */ > - > -int vmx_vcpu_check_block(struct kvm_vcpu *vcpu) > -{ > - struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); > - > - if (!irq_remapping_cap(IRQ_POSTING_CAP)) > - return 0; > - > - if (!kvm_vcpu_apicv_active(vcpu)) > - return 0; > - > - if (!kvm_arch_has_assigned_device(vcpu->kvm)) > - return 0; > - > - if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) > - return 0; > - > - return 1; > -} > - > /* > * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR. > */ > @@ -278,6 +252,7 @@ void vmx_pi_start_assignment(struct kvm *kvm) > if (!kvm_vcpu_apicv_active(vcpu)) > continue; > > + kvm_make_request(KVM_REQ_UNBLOCK, vcpu); > kvm_vcpu_wake_up(vcpu); > } > } > diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h > index 2aa082fd1c7ab..7f7b2326caf53 100644 > --- a/arch/x86/kvm/vmx/posted_intr.h > +++ b/arch/x86/kvm/vmx/posted_intr.h > @@ -96,6 +96,5 @@ bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu); > int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, > bool set); > void vmx_pi_start_assignment(struct kvm *kvm); > -int vmx_vcpu_check_block(struct kvm_vcpu *vcpu); > > #endif /* __KVM_X86_VMX_POSTED_INTR_H */ > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index ab68fed8b7e43..639ec3eba9b80 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -7716,7 +7716,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { > > .pre_block = vmx_pre_block, > .post_block = vmx_post_block, > - .vcpu_check_block = vmx_vcpu_check_block, > > .pmu_ops = &intel_pmu_ops, > .nested_ops = &vmx_nested_ops, > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index e6fee59b5dab6..739e1bd59e8a9 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -11177,6 +11177,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) > static_call(kvm_x86_smi_allowed)(vcpu, false))) > return true; > > + if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu)) > + return true; > + > if 
(kvm_arch_interrupt_allowed(vcpu) && > (kvm_cpu_has_interrupt(vcpu) || > kvm_guest_apic_has_interrupt(vcpu))) > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 26+ messages in thread
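The auto HALT restart behaviour quoted in the message above can be condensed into a few lines. This is a sketch that assumes a CPU which was halted when the SMI arrived; the `toy_` names are hypothetical, and only the flag position (bit 0) is taken from the quoted SDM text.

```c
#include <assert.h>

/* Bit 0 of the auto HALT restart field, per the SDM text quoted above. */
#define TOY_AUTO_HALT_RESTART 0x1u

enum toy_resume { TOY_RESUME_IN_HALT, TOY_RESUME_AFTER_HLT };

/* Flag value recorded on SMM entry when the SMI hit a halted CPU. */
unsigned int toy_smm_entry_flag(void)
{
	return TOY_AUTO_HALT_RESTART;
}

/* Where RSM returns control, depending on what the SMI handler left behind. */
enum toy_resume toy_rsm(unsigned int flag)
{
	return (flag & TOY_AUTO_HALT_RESTART) ? TOY_RESUME_IN_HALT
					      : TOY_RESUME_AFTER_HLT;
}

void toy_smi_scenario(void)
{
	unsigned int flag = toy_smm_entry_flag();

	/* Default: the handler leaves the flag set, the CPU re-enters HALT. */
	assert(toy_rsm(flag) == TOY_RESUME_IN_HALT);

	/* Only an explicit clear by the handler moves execution past HLT. */
	flag &= ~TOY_AUTO_HALT_RESTART;
	assert(toy_rsm(flag) == TOY_RESUME_AFTER_HLT);
}
```

Under this reading an SMI does not, by default, behave like a wake-up from HLT, which is the point made against treating HLT as freely interruptible.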
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device 2021-05-11 23:51 ` Marcelo Tosatti @ 2021-05-12 0:02 ` Marcelo Tosatti 2021-05-12 0:38 ` Peter Xu 2021-05-12 14:41 ` Sean Christopherson 0 siblings, 2 replies; 26+ messages in thread From: Marcelo Tosatti @ 2021-05-12 0:02 UTC (permalink / raw) To: Peter Xu Cc: Paolo Bonzini, Sean Christopherson, kvm, Alex Williamson, Pei Zhang On Tue, May 11, 2021 at 08:51:24PM -0300, Marcelo Tosatti wrote: > On Tue, May 11, 2021 at 05:35:41PM -0400, Peter Xu wrote: > > On Tue, May 11, 2021 at 02:18:10PM -0300, Marcelo Tosatti wrote: > > > On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote: > > > > On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote: > > > > > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote: > > > > > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote: > > > > > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events() > > > > > > > > somehow, so that even without customized ->vcpu_check_block we should be able > > > > > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)? > > > > > > > > > > > > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) > > > > > > > { > > > > > > > int ret = -EINTR; > > > > > > > int idx = srcu_read_lock(&vcpu->kvm->srcu); > > > > > > > > > > > > > > if (kvm_arch_vcpu_runnable(vcpu)) { > > > > > > > kvm_make_request(KVM_REQ_UNHALT, vcpu); <--- > > > > > > > goto out; > > > > > > > } > > > > > > > > > > > > > > Don't want to unhalt the vcpu. > > > > > > > > > > > > Could you elaborate? It's not obvious to me why we can't do that if > > > > > > pi_test_on() returns true.. we have pending post interrupts anyways, so > > > > > > shouldn't we stop halting? Thanks! > > > > > > > > > > pi_test_on() only returns true when an interrupt is signalled by the > > > > > device. But the sequence of events is: > > > > > > > > > > > > > > > 1. 
pCPU idles without notification vector configured to wakeup vector. > > > > > > > > > > 2. PCI device is hotplugged, assigned device count increases from 0 to 1. > > > > > > > > > > <arbitrary amount of time> > > > > > > > > > > 3. device generates interrupt, sets ON bit to true in the posted > > > > > interrupt descriptor. > > > > > > > > > > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit > > > > > is not set). > > > > > > > > Ah yes.. thanks. > > > > > > > > Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to > > > > define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h): > > > > > > > > #define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) > > > > > > > > We can set it in vmx_pi_start_assignment(), then check+clear it in > > > > kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?). > > > > > > Can't check it in kvm_vcpu_has_events() because that will set > > > KVM_REQ_UNHALT (which we don't want). > > > > I thought it was okay to break the guest HLT? > > Intel: > > "HLT-HALT > > Description > > Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and > SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an > interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer > (CS:EIP) points to the instruction following the HLT instruction." > > AMD: > > "6.5 Processor Halt > The processor halt instruction (HLT) halts instruction execution, leaving the processor in the halt state. > No registers or machine state are modified as a result of executing the HLT instruction. The processor > remains in the halt state until one of the following occurs: > • A non-maskable interrupt (NMI). > • An enabled, maskable interrupt (INTR). > • Processor reset (RESET). > • Processor initialization (INIT). > • System-management interrupt (SMI)." 
>
> The KVM_REQ_UNBLOCK patch will resume execution even any such event

even without any such event

> occuring. So the behaviour would be different from baremetal.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device 2021-05-12 0:02 ` Marcelo Tosatti @ 2021-05-12 0:38 ` Peter Xu 2021-05-12 11:10 ` Marcelo Tosatti 2021-05-12 14:41 ` Sean Christopherson 1 sibling, 1 reply; 26+ messages in thread From: Peter Xu @ 2021-05-12 0:38 UTC (permalink / raw) To: Marcelo Tosatti Cc: Paolo Bonzini, Sean Christopherson, kvm, Alex Williamson, Pei Zhang On Tue, May 11, 2021 at 09:02:59PM -0300, Marcelo Tosatti wrote: > On Tue, May 11, 2021 at 08:51:24PM -0300, Marcelo Tosatti wrote: > > On Tue, May 11, 2021 at 05:35:41PM -0400, Peter Xu wrote: > > > On Tue, May 11, 2021 at 02:18:10PM -0300, Marcelo Tosatti wrote: > > > > On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote: > > > > > On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote: > > > > > > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote: > > > > > > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote: > > > > > > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events() > > > > > > > > > somehow, so that even without customized ->vcpu_check_block we should be able > > > > > > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)? > > > > > > > > > > > > > > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) > > > > > > > > { > > > > > > > > int ret = -EINTR; > > > > > > > > int idx = srcu_read_lock(&vcpu->kvm->srcu); > > > > > > > > > > > > > > > > if (kvm_arch_vcpu_runnable(vcpu)) { > > > > > > > > kvm_make_request(KVM_REQ_UNHALT, vcpu); <--- > > > > > > > > goto out; > > > > > > > > } > > > > > > > > > > > > > > > > Don't want to unhalt the vcpu. > > > > > > > > > > > > > > Could you elaborate? It's not obvious to me why we can't do that if > > > > > > > pi_test_on() returns true.. we have pending post interrupts anyways, so > > > > > > > shouldn't we stop halting? Thanks! 
> > > > > > > > > > > > pi_test_on() only returns true when an interrupt is signalled by the > > > > > > device. But the sequence of events is: > > > > > > > > > > > > > > > > > > 1. pCPU idles without notification vector configured to wakeup vector. > > > > > > > > > > > > 2. PCI device is hotplugged, assigned device count increases from 0 to 1. > > > > > > > > > > > > <arbitrary amount of time> > > > > > > > > > > > > 3. device generates interrupt, sets ON bit to true in the posted > > > > > > interrupt descriptor. > > > > > > > > > > > > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit > > > > > > is not set). > > > > > > > > > > Ah yes.. thanks. > > > > > > > > > > Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to > > > > > define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h): > > > > > > > > > > #define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31) > > > > > > > > > > We can set it in vmx_pi_start_assignment(), then check+clear it in > > > > > kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?). > > > > > > > > Can't check it in kvm_vcpu_has_events() because that will set > > > > KVM_REQ_UNHALT (which we don't want). > > > > > > I thought it was okay to break the guest HLT? > > > > Intel: > > > > "HLT-HALT > > > > Description > > > > Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and > > SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an > > interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer > > (CS:EIP) points to the instruction following the HLT instruction." > > > > AMD: > > > > "6.5 Processor Halt > > The processor halt instruction (HLT) halts instruction execution, leaving the processor in the halt state. > > No registers or machine state are modified as a result of executing the HLT instruction. 
The processor > > remains in the halt state until one of the following occurs: > > • A non-maskable interrupt (NMI). > > • An enabled, maskable interrupt (INTR). > > • Processor reset (RESET). > > • Processor initialization (INIT). > > • System-management interrupt (SMI)." > > > > The KVM_REQ_UNBLOCK patch will resume execution even any such event > > even without any such event > > > occuring. So the behaviour would be different from baremetal. > What if we move that kvm_check_request() into kvm_vcpu_check_block()? ---8<--- diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 739e1bd59e8a9..e6fee59b5dab6 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11177,9 +11177,6 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) static_call(kvm_x86_smi_allowed)(vcpu, false))) return true; - if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu)) - return true; - if (kvm_arch_interrupt_allowed(vcpu) && (kvm_cpu_has_interrupt(vcpu) || kvm_guest_apic_has_interrupt(vcpu))) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index f68035355c08a..fc5f6bffff7fc 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2925,6 +2925,10 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) kvm_make_request(KVM_REQ_UNHALT, vcpu); goto out; } +#ifdef CONFIG_X86 + if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu)) + return true; +#endif if (kvm_cpu_has_pending_timer(vcpu)) goto out; if (signal_pending(current)) ---8<--- (The CONFIG_X86 is ugly indeed.. but just to show what I meant, e.g. it can be a boolean too I think) Would this work? Thanks, -- Peter Xu ^ permalink raw reply related [flat|nested] 26+ messages in thread
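The difference between the two placements of the KVM_REQ_UNBLOCK check can be shown with a toy model of the check-block step. This is a sketch under stated assumptions: the `toy_` types, the request bit values and the lock counter are hypothetical stand-ins, and `-1` stands in for `-EINTR`. Because the quoted `kvm_vcpu_check_block()` takes `srcu_read_lock()` and leaves through `goto out`, the sketch uses the same exit path for the new check instead of an early return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request bits; the kernel's numbering differs. */
enum { TOY_REQ_UNHALT = 1, TOY_REQ_UNBLOCK = 2 };

struct toy_vcpu {
	unsigned int requests;
	bool runnable;		/* stands in for kvm_arch_vcpu_runnable() */
	int lock_depth;		/* stands in for the SRCU read-side section */
};

/* Test-and-clear, in the spirit of kvm_check_request(). */
bool toy_check_request(struct toy_vcpu *v, unsigned int req)
{
	if (!(v->requests & req))
		return false;
	v->requests &= ~req;
	return true;
}

/* Returns nonzero when the block loop should be left. */
int toy_check_block(struct toy_vcpu *v)
{
	int ret = -1;

	v->lock_depth++;
	if (v->runnable) {
		/* Architectural wake-up: the guest leaves HLT. */
		v->requests |= TOY_REQ_UNHALT;
		goto out;
	}
	/* Host-side wake-up only: leave the loop, but do not unhalt. */
	if (toy_check_request(v, TOY_REQ_UNBLOCK))
		goto out;
	ret = 0;
out:
	v->lock_depth--;
	return ret;
}

void toy_unblock_scenario(void)
{
	struct toy_vcpu v = { .requests = TOY_REQ_UNBLOCK, .runnable = false };

	assert(toy_check_block(&v) != 0);		/* loop is left */
	assert(!(v.requests & TOY_REQ_UNHALT));		/* guest stays halted */
	assert(v.lock_depth == 0);			/* lock released on every path */
}
```

In this model the unblock request ends the wait without raising the unhalt request, so the vCPU can rerun its pre-block logic and then block again while the guest still observes an uninterrupted HLT.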
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
From: Marcelo Tosatti @ 2021-05-12 11:10 UTC
To: Peter Xu
Cc: Paolo Bonzini, Sean Christopherson, kvm, Alex Williamson, Pei Zhang

On Tue, May 11, 2021 at 08:38:16PM -0400, Peter Xu wrote:
> On Tue, May 11, 2021 at 09:02:59PM -0300, Marcelo Tosatti wrote:
> > On Tue, May 11, 2021 at 08:51:24PM -0300, Marcelo Tosatti wrote:
> > > On Tue, May 11, 2021 at 05:35:41PM -0400, Peter Xu wrote:
> > > > On Tue, May 11, 2021 at 02:18:10PM -0300, Marcelo Tosatti wrote:
> > > > > On Tue, May 11, 2021 at 12:19:56PM -0400, Peter Xu wrote:
> > > > > > On Tue, May 11, 2021 at 11:51:57AM -0300, Marcelo Tosatti wrote:
> > > > > > > On Tue, May 11, 2021 at 10:39:11AM -0400, Peter Xu wrote:
> > > > > > > > On Fri, May 07, 2021 at 07:08:31PM -0300, Marcelo Tosatti wrote:
> > > > > > > > > > Wondering whether we should add a pi_test_on() check in kvm_vcpu_has_events()
> > > > > > > > > > somehow, so that even without customized ->vcpu_check_block we should be able
> > > > > > > > > > to break the block loop (as kvm_arch_vcpu_runnable will return true properly)?
> > > > > > > > >
> > > > > > > > > static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
> > > > > > > > > {
> > > > > > > > >         int ret = -EINTR;
> > > > > > > > >         int idx = srcu_read_lock(&vcpu->kvm->srcu);
> > > > > > > > >
> > > > > > > > >         if (kvm_arch_vcpu_runnable(vcpu)) {
> > > > > > > > >                 kvm_make_request(KVM_REQ_UNHALT, vcpu); <---
> > > > > > > > >                 goto out;
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > Don't want to unhalt the vcpu.
> > > > > > > >
> > > > > > > > Could you elaborate? It's not obvious to me why we can't do that if
> > > > > > > > pi_test_on() returns true.. we have pending post interrupts anyways, so
> > > > > > > > shouldn't we stop halting? Thanks!
> > > > > > >
> > > > > > > pi_test_on() only returns true when an interrupt is signalled by the
> > > > > > > device. But the sequence of events is:
> > > > > > >
> > > > > > >
> > > > > > > 1. pCPU idles without notification vector configured to wakeup vector.
> > > > > > >
> > > > > > > 2. PCI device is hotplugged, assigned device count increases from 0 to 1.
> > > > > > >
> > > > > > > <arbitrary amount of time>
> > > > > > >
> > > > > > > 3. device generates interrupt, sets ON bit to true in the posted
> > > > > > > interrupt descriptor.
> > > > > > >
> > > > > > > We want to exit kvm_vcpu_block after 2, but before 3 (where ON bit
> > > > > > > is not set).
> > > > > >
> > > > > > Ah yes.. thanks.
> > > > > >
> > > > > > Besides the current approach, I'm thinking maybe it'll be cleaner/less LOC to
> > > > > > define a KVM_REQ_UNBLOCK to replace the pre_block hook (in x86's kvm_host.h):
> > > > > >
> > > > > > #define KVM_REQ_UNBLOCK KVM_ARCH_REQ(31)
> > > > > >
> > > > > > We can set it in vmx_pi_start_assignment(), then check+clear it in
> > > > > > kvm_vcpu_has_events() (or make it a bool in kvm_vcpu struct?).
> > > > >
> > > > > Can't check it in kvm_vcpu_has_events() because that will set
> > > > > KVM_REQ_UNHALT (which we don't want).
> > > >
> > > > I thought it was okay to break the guest HLT?
> > >
> > > Intel:
> > >
> > > "HLT-HALT
> > >
> > > Description
> > >
> > > Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and
> > > SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an
> > > interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer
> > > (CS:EIP) points to the instruction following the HLT instruction."
> > >
> > > AMD:
> > >
> > > "6.5 Processor Halt
> > > The processor halt instruction (HLT) halts instruction execution, leaving the processor in the halt state.
> > > No registers or machine state are modified as a result of executing the HLT instruction. The processor
> > > remains in the halt state until one of the following occurs:
> > > • A non-maskable interrupt (NMI).
> > > • An enabled, maskable interrupt (INTR).
> > > • Processor reset (RESET).
> > > • Processor initialization (INIT).
> > > • System-management interrupt (SMI)."
> > >
> > > The KVM_REQ_UNBLOCK patch will resume execution even any such event
> >
> > even without any such event
> >
> > > occuring. So the behaviour would be different from baremetal.
> >
>
> What if we move that kvm_check_request() into kvm_vcpu_check_block()?
>
> ---8<---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 739e1bd59e8a9..e6fee59b5dab6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11177,9 +11177,6 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
>  	     static_call(kvm_x86_smi_allowed)(vcpu, false)))
>  		return true;
>  
> -	if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
> -		return true;
> -
>  	if (kvm_arch_interrupt_allowed(vcpu) &&
>  	    (kvm_cpu_has_interrupt(vcpu) ||
>  	    kvm_guest_apic_has_interrupt(vcpu)))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f68035355c08a..fc5f6bffff7fc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2925,6 +2925,10 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>  		kvm_make_request(KVM_REQ_UNHALT, vcpu);
>  		goto out;
>  	}
> +#ifdef CONFIG_X86
> +	if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
> +		return true;
> +#endif
>  	if (kvm_cpu_has_pending_timer(vcpu))
>  		goto out;
>  	if (signal_pending(current))
> ---8<---
>
> (The CONFIG_X86 is ugly indeed.. but just to show what I meant, e.g. it can be
> a boolean too I think)
>
> Would this work?

That would work: but vcpu->requests are nicely checked (and processed)
at vcpu_enter_guest, before guest entry.

The proposed request does not follow that pattern.
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
From: Sean Christopherson @ 2021-05-12 14:41 UTC
To: Marcelo Tosatti
Cc: Peter Xu, Paolo Bonzini, kvm, Alex Williamson, Pei Zhang

On Tue, May 11, 2021, Marcelo Tosatti wrote:
> > The KVM_REQ_UNBLOCK patch will resume execution even any such event
>
> even without any such event
>
> > occuring. So the behaviour would be different from baremetal.

I agree with Marcelo, we don't want to spuriously unhalt the vCPU. It's legal,
albeit risky, to do something like

	hlt
	/* #UD to triple fault if this CPU is awakened. */
	ud2

when offlining a CPU, in which case the spurious wake event will crash the guest.
* Re: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
From: Peter Xu @ 2021-05-12 15:34 UTC
To: Sean Christopherson
Cc: Marcelo Tosatti, Paolo Bonzini, kvm, Alex Williamson, Pei Zhang

On Wed, May 12, 2021 at 02:41:56PM +0000, Sean Christopherson wrote:
> On Tue, May 11, 2021, Marcelo Tosatti wrote:
> > > The KVM_REQ_UNBLOCK patch will resume execution even any such event
> >
> > even without any such event
> >
> > > occuring. So the behaviour would be different from baremetal.
>
> I agree with Marcelo, we don't want to spuriously unhalt the vCPU. It's legal,
> albeit risky, to do something like
>
> 	hlt
> 	/* #UD to triple fault if this CPU is awakened. */
> 	ud2
>
> when offlining a CPU, in which case the spurious wake event will crash the guest.

We can avoid that by moving the check+clear of KVM_REQ_UNBLOCK from
kvm_vcpu_has_events() into kvm_vcpu_check_block() as replied in the other
thread. But I also agree Marcelo's series should work already to fix the bug,
hence no strong opinion on this.

Thanks,

-- 
Peter Xu
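The distinction debated above (leaving the block loop versus unhalting the vCPU) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every toy_* name and both request numbers are hypothetical, nothing here is KVM code, and toy_check_request() merely imitates the check+clear behaviour that the thread attributes to kvm_check_request().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request numbers for this model; not the kernel's values. */
enum toy_req { TOY_REQ_UNHALT = 0, TOY_REQ_UNBLOCK = 1 };

struct toy_vcpu {
	uint32_t requests;	/* one bit per pending request */
	bool runnable;		/* stands in for kvm_arch_vcpu_runnable() */
};

static void toy_make_request(struct toy_vcpu *vcpu, unsigned int req)
{
	assert(req < 32);
	vcpu->requests |= UINT32_C(1) << req;
}

static bool toy_request_pending(const struct toy_vcpu *vcpu, unsigned int req)
{
	assert(req < 32);
	return (vcpu->requests & (UINT32_C(1) << req)) != 0;
}

/* Check and clear in one step: a request is consumed by the first check. */
static bool toy_check_request(struct toy_vcpu *vcpu, unsigned int req)
{
	if (!toy_request_pending(vcpu, req))
		return false;
	vcpu->requests &= ~(UINT32_C(1) << req);
	return true;
}

/* Returns true when the (modelled) block loop should stop. */
static bool toy_check_block(struct toy_vcpu *vcpu)
{
	if (vcpu->runnable) {
		/* A real wake event: the guest leaves HLT. */
		toy_make_request(vcpu, TOY_REQ_UNHALT);
		return true;
	}
	/* Leave the loop, but do not mark the vCPU as unhalted. */
	if (toy_check_request(vcpu, TOY_REQ_UNBLOCK))
		return true;
	return false;
}
```

In this model, raising TOY_REQ_UNBLOCK on a non-runnable vCPU makes toy_check_block() return true exactly once and leaves TOY_REQ_UNHALT clear, which is the property asked for in the discussion: the loop is broken without a spurious wake of the guest.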
* [patch 0/4] VMX: configure posted interrupt descriptor when assigning device (v3)
From: Marcelo Tosatti @ 2021-05-10 17:26 UTC
To: kvm
Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Peter Xu

Configuration of the posted interrupt descriptor is incorrect when devices
are hotplugged to the guest (and vcpus are halted).

See patch 4 for details.

---

v3: improved comments (Sean)
    use kvm_vcpu_wake_up (Sean)
    drop device_count from start_assignment function (Peter Xu)

v2: rather than using a potentially racy IPI (vs vcpu->cpu switches),
    kick the vcpus when assigning a device and let the blocked per-CPU
    list manipulation happen locally at ->pre_block and ->post_block
    (Sean Christopherson).
* [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
From: Marcelo Tosatti @ 2021-05-10 17:26 UTC
To: kvm
Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Peter Xu, Marcelo Tosatti

Add a start_assignment hook to kvm_x86_ops, which is called when
kvm_arch_start_assignment is done.

The hook is required to update the wakeup vector of a sleeping vCPU
when a device is assigned to the guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
 
 	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
+	void (*start_assignment)(struct kvm *kvm);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
 
Index: kvm/arch/x86/kvm/svm/svm.c
===================================================================
--- kvm.orig/arch/x86/kvm/svm/svm.c
+++ kvm/arch/x86/kvm/svm/svm.c
@@ -4601,6 +4601,7 @@ static struct kvm_x86_ops svm_x86_ops __
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.dy_apicv_has_pending_interrupt = svm_dy_apicv_has_pending_interrupt,
 	.update_pi_irte = svm_update_pi_irte,
+	.start_assignment = NULL,
 	.setup_mce = svm_setup_mce,
 
 	.smi_allowed = svm_smi_allowed,
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7732,6 +7732,7 @@ static struct kvm_x86_ops vmx_x86_ops __
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
+	.start_assignment = NULL,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,
Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -11295,7 +11295,11 @@ bool kvm_arch_can_dequeue_async_page_pre
 
 void kvm_arch_start_assignment(struct kvm *kvm)
 {
-	atomic_inc(&kvm->arch.assigned_device_count);
+	int ret;
+
+	ret = atomic_inc_return(&kvm->arch.assigned_device_count);
+	if (ret == 1)
+		static_call_cond(kvm_x86_start_assignment)(kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
 
Index: kvm/arch/x86/include/asm/kvm-x86-ops.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm-x86-ops.h
+++ kvm/arch/x86/include/asm/kvm-x86-ops.h
@@ -99,6 +99,7 @@ KVM_X86_OP_NULL(post_block)
 KVM_X86_OP_NULL(vcpu_blocking)
 KVM_X86_OP_NULL(vcpu_unblocking)
 KVM_X86_OP_NULL(update_pi_irte)
+KVM_X86_OP_NULL(start_assignment)
 KVM_X86_OP_NULL(apicv_post_state_restore)
 KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
 KVM_X86_OP_NULL(set_hv_timer)
* Re: [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
From: Peter Xu @ 2021-05-11 16:26 UTC
To: Marcelo Tosatti
Cc: kvm, Paolo Bonzini, Alex Williamson, Sean Christopherson

On Mon, May 10, 2021 at 02:26:47PM -0300, Marcelo Tosatti wrote:
> Add a start_assignment hook to kvm_x86_ops, which is called when
> kvm_arch_start_assignment is done.
>
> The hook is required to update the wakeup vector of a sleeping vCPU
> when a device is assigned to the guest.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Index: kvm/arch/x86/include/asm/kvm_host.h
> ===================================================================
> --- kvm.orig/arch/x86/include/asm/kvm_host.h
> +++ kvm/arch/x86/include/asm/kvm_host.h
> @@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
>  
>  	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
>  			      uint32_t guest_irq, bool set);
> +	void (*start_assignment)(struct kvm *kvm);
>  	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
>  	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
>  
> Index: kvm/arch/x86/kvm/svm/svm.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/svm/svm.c
> +++ kvm/arch/x86/kvm/svm/svm.c
> @@ -4601,6 +4601,7 @@ static struct kvm_x86_ops svm_x86_ops __
>  	.deliver_posted_interrupt = svm_deliver_avic_intr,
>  	.dy_apicv_has_pending_interrupt = svm_dy_apicv_has_pending_interrupt,
>  	.update_pi_irte = svm_update_pi_irte,
> +	.start_assignment = NULL,

Can this be dropped (as default NULL)?

>  	.setup_mce = svm_setup_mce,
>  
>  	.smi_allowed = svm_smi_allowed,
> Index: kvm/arch/x86/kvm/vmx/vmx.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/vmx/vmx.c
> +++ kvm/arch/x86/kvm/vmx/vmx.c
> @@ -7732,6 +7732,7 @@ static struct kvm_x86_ops vmx_x86_ops __
>  	.nested_ops = &vmx_nested_ops,
>  
>  	.update_pi_irte = pi_update_irte,
> +	.start_assignment = NULL,

Same here?

>  
>  #ifdef CONFIG_X86_64
>  	.set_hv_timer = vmx_set_hv_timer,
> Index: kvm/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/x86.c
> +++ kvm/arch/x86/kvm/x86.c
> @@ -11295,7 +11295,11 @@ bool kvm_arch_can_dequeue_async_page_pre
>  
>  void kvm_arch_start_assignment(struct kvm *kvm)
>  {
> -	atomic_inc(&kvm->arch.assigned_device_count);
> +	int ret;
> +
> +	ret = atomic_inc_return(&kvm->arch.assigned_device_count);
> +	if (ret == 1)
> +		static_call_cond(kvm_x86_start_assignment)(kvm);

Maybe "ret" can be dropped too?

void kvm_arch_start_assignment(struct kvm *kvm)
{
	if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
		static_call_cond(kvm_x86_start_assignment)(kvm);
}

Otherwise looks good to me. Thanks,

-- 
Peter Xu
* Re: [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
From: Marcelo Tosatti @ 2021-05-11 17:29 UTC
To: Peter Xu
Cc: kvm, Paolo Bonzini, Alex Williamson, Sean Christopherson

On Tue, May 11, 2021 at 12:26:08PM -0400, Peter Xu wrote:
> On Mon, May 10, 2021 at 02:26:47PM -0300, Marcelo Tosatti wrote:
> > Add a start_assignment hook to kvm_x86_ops, which is called when
> > kvm_arch_start_assignment is done.
> >
> > The hook is required to update the wakeup vector of a sleeping vCPU
> > when a device is assigned to the guest.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >
> > Index: kvm/arch/x86/include/asm/kvm_host.h
> > ===================================================================
> > --- kvm.orig/arch/x86/include/asm/kvm_host.h
> > +++ kvm/arch/x86/include/asm/kvm_host.h
> > @@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
> >  
> >  	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
> >  			      uint32_t guest_irq, bool set);
> > +	void (*start_assignment)(struct kvm *kvm);
> >  	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
> >  	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
> >  
> > Index: kvm/arch/x86/kvm/svm/svm.c
> > ===================================================================
> > --- kvm.orig/arch/x86/kvm/svm/svm.c
> > +++ kvm/arch/x86/kvm/svm/svm.c
> > @@ -4601,6 +4601,7 @@ static struct kvm_x86_ops svm_x86_ops __
> >  	.deliver_posted_interrupt = svm_deliver_avic_intr,
> >  	.dy_apicv_has_pending_interrupt = svm_dy_apicv_has_pending_interrupt,
> >  	.update_pi_irte = svm_update_pi_irte,
> > +	.start_assignment = NULL,
>
> Can this be dropped (as default NULL)?

Done.

>
> >  	.setup_mce = svm_setup_mce,
> >  
> >  	.smi_allowed = svm_smi_allowed,
> > Index: kvm/arch/x86/kvm/vmx/vmx.c
> > ===================================================================
> > --- kvm.orig/arch/x86/kvm/vmx/vmx.c
> > +++ kvm/arch/x86/kvm/vmx/vmx.c
> > @@ -7732,6 +7732,7 @@ static struct kvm_x86_ops vmx_x86_ops __
> >  	.nested_ops = &vmx_nested_ops,
> >  
> >  	.update_pi_irte = pi_update_irte,
> > +	.start_assignment = NULL,
>
> Same here?

Done.

> >  
> >  #ifdef CONFIG_X86_64
> >  	.set_hv_timer = vmx_set_hv_timer,
> > Index: kvm/arch/x86/kvm/x86.c
> > ===================================================================
> > --- kvm.orig/arch/x86/kvm/x86.c
> > +++ kvm/arch/x86/kvm/x86.c
> > @@ -11295,7 +11295,11 @@ bool kvm_arch_can_dequeue_async_page_pre
> >  
> >  void kvm_arch_start_assignment(struct kvm *kvm)
> >  {
> > -	atomic_inc(&kvm->arch.assigned_device_count);
> > +	int ret;
> > +
> > +	ret = atomic_inc_return(&kvm->arch.assigned_device_count);
> > +	if (ret == 1)
> > +		static_call_cond(kvm_x86_start_assignment)(kvm);
>
> Maybe "ret" can be dropped too?
>
> void kvm_arch_start_assignment(struct kvm *kvm)
> {
> 	if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
> 		static_call_cond(kvm_x86_start_assignment)(kvm);
> }
>
> Otherwise looks good to me. Thanks,

Done.
* [patch 0/4] VMX: configure posted interrupt descriptor when assigning device (v4)
From: Marcelo Tosatti @ 2021-05-11 23:57 UTC
To: kvm
Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Peter Xu

Configuration of the posted interrupt descriptor is incorrect when devices
are hotplugged to the guest (and vcpus are halted).

See patch 4 for details.

---

v4: remove NULL assignments from kvm_x86_ops (Peter Xu)
    check for return value of ->start_assignment directly (Peter Xu)

v3: improved comments (Sean)
    use kvm_vcpu_wake_up (Sean)
    drop device_count from start_assignment function (Peter Xu)

v2: rather than using a potentially racy IPI (vs vcpu->cpu switches),
    kick the vcpus when assigning a device and let the blocked per-CPU
    list manipulation happen locally at ->pre_block and ->post_block
    (Sean Christopherson).
* [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
From: Marcelo Tosatti @ 2021-05-11 23:57 UTC
To: kvm
Cc: Paolo Bonzini, Alex Williamson, Sean Christopherson, Peter Xu, Marcelo Tosatti

Add a start_assignment hook to kvm_x86_ops, which is called when
kvm_arch_start_assignment is done.

The hook is required to update the wakeup vector of a sleeping vCPU
when a device is assigned to the guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/arch/x86/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm_host.h
+++ kvm/arch/x86/include/asm/kvm_host.h
@@ -1322,6 +1322,7 @@ struct kvm_x86_ops {
 
 	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
+	void (*start_assignment)(struct kvm *kvm);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
 
Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -11295,7 +11295,8 @@ bool kvm_arch_can_dequeue_async_page_pre
 
 void kvm_arch_start_assignment(struct kvm *kvm)
 {
-	atomic_inc(&kvm->arch.assigned_device_count);
+	if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
+		static_call_cond(kvm_x86_start_assignment)(kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
 
Index: kvm/arch/x86/include/asm/kvm-x86-ops.h
===================================================================
--- kvm.orig/arch/x86/include/asm/kvm-x86-ops.h
+++ kvm/arch/x86/include/asm/kvm-x86-ops.h
@@ -99,6 +99,7 @@ KVM_X86_OP_NULL(post_block)
 KVM_X86_OP_NULL(vcpu_blocking)
 KVM_X86_OP_NULL(vcpu_unblocking)
 KVM_X86_OP_NULL(update_pi_irte)
+KVM_X86_OP_NULL(start_assignment)
 KVM_X86_OP_NULL(apicv_post_state_restore)
 KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
 KVM_X86_OP_NULL(set_hv_timer)
* Re: [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops
From: Peter Xu @ 2021-05-12 15:30 UTC
To: Marcelo Tosatti
Cc: kvm, Paolo Bonzini, Alex Williamson, Sean Christopherson

On Tue, May 11, 2021 at 08:57:39PM -0300, Marcelo Tosatti wrote:
> Add a start_assignment hook to kvm_x86_ops, which is called when
> kvm_arch_start_assignment is done.
>
> The hook is required to update the wakeup vector of a sleeping vCPU
> when a device is assigned to the guest.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu
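The pattern in kvm_arch_start_assignment() (invoke an optional hook only when the assigned device count goes from 0 to 1) can be tried out in userspace. The following is a sketch under assumptions, not the kernel implementation: struct toy_kvm, toy_start_assignment(), toy_end_assignment() and toy_hook() are hypothetical names, C11 atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return(), and a plain NULL check stands in for static_call_cond().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_kvm {
	atomic_int assigned_device_count;
	/* Optional hook; NULL models an arch that does not implement it. */
	void (*start_assignment)(struct toy_kvm *kvm);
	int hook_calls;		/* bookkeeping for this model only */
};

/* Hypothetical hook body: just records that it ran. */
static void toy_hook(struct toy_kvm *kvm)
{
	kvm->hook_calls++;
}

static void toy_start_assignment(struct toy_kvm *kvm)
{
	assert(kvm != NULL);
	/*
	 * atomic_fetch_add() returns the old value, so old + 1 is the new
	 * count. Only the caller that observes the 0 -> 1 transition runs
	 * the hook, and only if a hook is installed.
	 */
	if (atomic_fetch_add(&kvm->assigned_device_count, 1) + 1 == 1 &&
	    kvm->start_assignment)
		kvm->start_assignment(kvm);
}

/* Counterpart that drops the count again; it never calls the hook. */
static void toy_end_assignment(struct toy_kvm *kvm)
{
	assert(kvm != NULL);
	atomic_fetch_sub(&kvm->assigned_device_count, 1);
}
```

With a hook installed, the first toy_start_assignment() runs it and a second call does not. After the count has dropped back to zero, the next assignment runs it again. Because the increment and the read of the new value are a single atomic operation, two concurrent callers cannot both observe the value 1.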