kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Peter Xu <peterx@redhat.com>, Pei Zhang <pezhang@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
Date: Mon, 10 May 2021 14:26:50 -0300	[thread overview]
Message-ID: <20210510172818.025080848@redhat.com> (raw)
In-Reply-To: 20210510172646.930550753@redhat.com

For VMX, when a vcpu enters HLT emulation, pi_post_block will:

1) Add vcpu to per-cpu list of blocked vcpus.

2) Program the posted-interrupt descriptor "notification vector" 
to POSTED_INTR_WAKEUP_VECTOR

With interrupt remapping, an interrupt will set the PIR bit for the 
vector programmed for the device on the CPU, test-and-set the 
ON bit on the posted interrupt descriptor, and if the ON bit is clear
generate an interrupt for the notification vector.

This way, the target CPU wakes upon a device interrupt and wakes up
the target vcpu.

Problem is that pi_post_block only programs the notification vector
if kvm_arch_has_assigned_device() is true. Its possible for the
following to happen:

1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false,
notification vector is not programmed
2) device is assigned to VM
3) device interrupts vcpu V, sets ON bit
(notification vector not programmed, so pcpu P remains in idle)
4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set,
kvm_vcpu_kick is skipped
5) vcpu 0 busy spins on vcpu V's response for several seconds, until
RCU watchdog NMIs all vCPUs.

To fix this, use the start_assignment kvm_x86_ops callback to kick
vcpus out of the halt loop, so the notification vector is 
properly reprogrammed to the wakeup vector.

Reported-by: Pei Zhang <pezhang@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


Index: kvm/arch/x86/kvm/vmx/posted_intr.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
+++ kvm/arch/x86/kvm/vmx/posted_intr.c
@@ -204,6 +204,32 @@ void pi_post_block(struct kvm_vcpu *vcpu
 }
 
 /*
+ * Bail out of the block loop if the VM has an assigned
+ * device, but the blocking vCPU didn't reconfigure the
+ * PI.NV to the wakeup vector, i.e. the assigned device
+ * came along after the initial check in vcpu_block().
+ */
+
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	if (!kvm_vcpu_apicv_active(vcpu))
+		return 0;
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
+		return 0;
+
+	return 1;
+}
+
+/*
  * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
  */
 void pi_wakeup_handler(void)
@@ -236,6 +262,25 @@ bool pi_has_pending_interrupt(struct kvm
 		(pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
 }
 
+void vmx_pi_start_assignment(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	/*
+	 * Wakeup will cause the vCPU to bail out of kvm_vcpu_block() and
+	 * go back through vcpu_block().
+	 */
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_vcpu_apicv_active(vcpu))
+			continue;
+
+		kvm_vcpu_wake_up(vcpu);
+	}
+}
 
 /*
  * pi_update_irte - set IRTE for Posted-Interrupts
Index: kvm/arch/x86/kvm/vmx/posted_intr.h
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.h
+++ kvm/arch/x86/kvm/vmx/posted_intr.h
@@ -95,5 +95,7 @@ void __init pi_init_cpu(int cpu);
 bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
 int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
 		   bool set);
+void vmx_pi_start_assignment(struct kvm *kvm);
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_X86_VMX_POSTED_INTR_H */
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7727,13 +7727,13 @@ static struct kvm_x86_ops vmx_x86_ops __
 
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
-	.vcpu_check_block = NULL,
+	.vcpu_check_block = vmx_vcpu_check_block,
 
 	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
-	.start_assignment = NULL,
+	.start_assignment = vmx_pi_start_assignment,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,



  parent reply	other threads:[~2021-05-10 17:54 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-10 17:26 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device (v3) Marcelo Tosatti
2021-05-10 17:26 ` [patch 1/4] KVM: x86: add start_assignment hook to kvm_x86_ops Marcelo Tosatti
2021-05-11 16:26   ` Peter Xu
2021-05-11 17:29     ` Marcelo Tosatti
2021-05-10 17:26 ` [patch 2/4] KVM: add arch specific vcpu_check_block callback Marcelo Tosatti
2021-05-10 17:26 ` [patch 3/4] KVM: x86: implement kvm_arch_vcpu_check_block callback Marcelo Tosatti
2021-05-10 17:26 ` Marcelo Tosatti [this message]
2021-05-24 15:55   ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device Paolo Bonzini
2021-05-24 17:53     ` Marcelo Tosatti
2021-05-25 11:58       ` Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2021-05-11 23:57 [patch 0/4] VMX: configure posted interrupt descriptor when assigning device (v4) Marcelo Tosatti
2021-05-11 23:57 ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device Marcelo Tosatti
2021-05-07 13:06 [patch 0/4] VMX: configure posted interrupt " Marcelo Tosatti
2021-05-07 13:06 ` [patch 4/4] KVM: VMX: update vcpu posted-interrupt " Marcelo Tosatti
2021-05-07 17:22   ` Sean Christopherson
2021-05-07 19:29     ` Peter Xu
2021-05-07 22:08       ` Marcelo Tosatti
2021-05-11 14:39         ` Peter Xu
2021-05-11 14:51           ` Marcelo Tosatti
2021-05-11 16:19             ` Peter Xu
2021-05-11 17:18               ` Marcelo Tosatti
2021-05-11 21:35                 ` Peter Xu
2021-05-11 23:51                   ` Marcelo Tosatti
2021-05-12  0:02                     ` Marcelo Tosatti
2021-05-12  0:38                       ` Peter Xu
2021-05-12 11:10                         ` Marcelo Tosatti
2021-05-12 14:41                       ` Sean Christopherson
2021-05-12 15:34                         ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210510172818.025080848@redhat.com \
    --to=mtosatti@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=pezhang@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).