From: Marcelo Tosatti <mtosatti@redhat.com>
To: kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Pei Zhang <pezhang@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: [patch 4/4] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
Date: Fri, 07 May 2021 10:06:13 -0300
Message-ID: <20210507130923.528132061@redhat.com>
In-Reply-To: 20210507130609.269153197@redhat.com

For VMX, when a vcpu enters HLT emulation, pi_pre_block will:

1) Add vcpu to per-cpu list of blocked vcpus.

2) Program the posted-interrupt descriptor "notification vector"
to POSTED_INTR_WAKEUP_VECTOR.

With interrupt remapping, a device interrupt sets the PIR bit for the
vector programmed for the device on the CPU and test-and-sets the
ON bit in the posted-interrupt descriptor. If the ON bit was previously
clear, an interrupt is generated for the notification vector.
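
The posting step described above can be sketched as a small user-space
model. This is only an illustration: struct model_pi_desc and
model_post_interrupt() are hypothetical names, the SN (suppress
notification) bit is ignored, and the test-and-set is not atomic here
as it is in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a posted-interrupt descriptor. */
struct model_pi_desc {
	uint64_t pir[4];	/* 256 posted-interrupt request bits */
	bool on;		/* outstanding-notification (ON) bit */
	uint8_t nv;		/* notification vector */
};

/*
 * Model of posting a device interrupt for 'vector'.
 * Returns the notification vector raised on the target CPU,
 * or -1 if ON was already set and no notification is generated.
 */
int model_post_interrupt(struct model_pi_desc *pi, uint8_t vector)
{
	bool was_on;

	/* Record the request in the PIR bitmap. */
	pi->pir[vector / 64] |= 1ULL << (vector % 64);

	/* Test-and-set ON (not atomic in this model). */
	was_on = pi->on;
	pi->on = true;

	return was_on ? -1 : pi->nv;
}
```

In this model only the first interrupt after ON is cleared produces a
notification; later ones just accumulate bits in PIR.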

This way, a device interrupt wakes the target CPU, which in turn
wakes up the target vcpu.

The problem is that pi_pre_block only programs the notification vector
if kvm_arch_has_assigned_device() is true. It is possible for the
following to happen:

1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false,
notification vector is not programmed
2) device is assigned to VM
3) device interrupts vcpu V, sets ON bit
(notification vector not programmed, so pcpu P remains in idle)
4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set,
kvm_vcpu_kick is skipped
5) vcpu 0 busy spins on vcpu V's response for several seconds, until
RCU watchdog NMIs all vCPUs.
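
Steps 1-4 can be replayed with a toy model. Everything below is an
assumption made for illustration: the model_* names are hypothetical,
the vector values are only illustrative, and "woken" stands in for
whatever actually gets the vcpu out of the halt loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_POSTED_INTR_VECTOR	0xf2	/* illustrative value */
#define MODEL_POSTED_INTR_WAKEUP_VECTOR	0xf1	/* illustrative value */

/* Hypothetical model of the vcpu state relevant to the race. */
struct model_vcpu {
	bool on;	/* ON bit of the PI descriptor */
	uint8_t nv;	/* notification vector */
	bool halted;	/* blocked in HLT emulation */
	bool woken;	/* set once something wakes the vcpu */
};

/* Step 1: the wakeup vector is programmed only if a device is assigned. */
void model_halt(struct model_vcpu *v, bool has_assigned_device)
{
	v->halted = true;
	v->woken = false;
	if (has_assigned_device)
		v->nv = MODEL_POSTED_INTR_WAKEUP_VECTOR;
}

/* Step 3: a device interrupt wakes a halted vcpu only via the wakeup vector. */
void model_device_interrupt(struct model_vcpu *v)
{
	bool was_on = v->on;

	v->on = true;
	if (!was_on && v->halted && v->nv == MODEL_POSTED_INTR_WAKEUP_VECTOR)
		v->woken = true;
}

/* Step 4: an IPI from another vcpu; returns false if the kick is skipped. */
bool model_send_ipi(struct model_vcpu *v)
{
	bool was_on = v->on;

	v->on = true;
	if (was_on)
		return false;	/* ON already set: kick skipped */
	v->woken = true;	/* stands in for kvm_vcpu_kick() */
	return true;
}
```

Calling model_halt(&v, false), then model_device_interrupt(&v), then
model_send_ipi(&v) leaves v.on set and v.woken clear, which is the
stuck state from step 5.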

To fix this, use the start_assignment kvm_x86_ops callback to kick
vcpus out of the halt loop, so the notification vector is 
properly reprogrammed to the wakeup vector.
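
The intended effect of the fix can be sketched outside the kernel as
follows. This is a sketch under stated assumptions, not the kernel
code: the model_* names are hypothetical, the vector values are only
illustrative, and a nonzero return from the check is assumed to make
the vcpu leave the halt loop and go through the block path again.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_POSTED_INTR_VECTOR	0xf2	/* illustrative value */
#define MODEL_POSTED_INTR_WAKEUP_VECTOR	0xf1	/* illustrative value */

/* Hypothetical flattened view of the state the check looks at. */
struct model_state {
	bool posting_capable;		/* IRQ posting supported */
	bool apicv_active;		/* APICv active on this vcpu */
	bool has_assigned_device;	/* VM has an assigned device */
	uint8_t nv;			/* current notification vector */
};

/* Nonzero means "stop blocking": the wakeup vector is not programmed yet. */
int model_check_block(const struct model_state *s)
{
	if (!s->posting_capable || !s->apicv_active)
		return 0;
	if (!s->has_assigned_device)
		return 0;
	if (s->nv == MODEL_POSTED_INTR_WAKEUP_VECTOR)
		return 0;
	return 1;
}

/* Re-entering the block path programs the wakeup vector if a device is assigned. */
void model_reblock(struct model_state *s)
{
	if (s->posting_capable && s->apicv_active && s->has_assigned_device)
		s->nv = MODEL_POSTED_INTR_WAKEUP_VECTOR;
}
```

Once a device is assigned, model_check_block() returns 1 until
model_reblock() has programmed the wakeup vector, after which it
returns 0 and the vcpu can stay blocked.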

Reported-by: Pei Zhang <pezhang@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---

v2: add vmx_pi_start_assignment to vmx's kvm_x86_ops

Index: kvm/arch/x86/kvm/vmx/posted_intr.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.c
+++ kvm/arch/x86/kvm/vmx/posted_intr.c
@@ -203,6 +203,25 @@ void pi_post_block(struct kvm_vcpu *vcpu
 	local_irq_enable();
 }
 
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	if (!kvm_vcpu_apicv_active(vcpu))
+		return 0;
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR)
+		return 0;
+
+	return 1;
+}
+
 /*
  * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
  */
@@ -236,6 +255,26 @@ bool pi_has_pending_interrupt(struct kvm
 		(pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
 }
 
+void vmx_pi_start_assignment(struct kvm *kvm, int device_count)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	/* only care about first device assignment */
+	if (device_count != 1)
+		return;
+
+	/* Update wakeup vector and add vcpu to blocked_vcpu_list */
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_vcpu_apicv_active(vcpu))
+			continue;
+
+		kvm_vcpu_kick(vcpu);
+	}
+}
 
 /*
  * pi_update_irte - set IRTE for Posted-Interrupts
Index: kvm/arch/x86/kvm/vmx/posted_intr.h
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/posted_intr.h
+++ kvm/arch/x86/kvm/vmx/posted_intr.h
@@ -95,5 +95,7 @@ void __init pi_init_cpu(int cpu);
 bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
 int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
 		   bool set);
+void vmx_pi_start_assignment(struct kvm *kvm, int device_count);
+int vmx_vcpu_check_block(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_X86_VMX_POSTED_INTR_H */
Index: kvm/arch/x86/kvm/vmx/vmx.c
===================================================================
--- kvm.orig/arch/x86/kvm/vmx/vmx.c
+++ kvm/arch/x86/kvm/vmx/vmx.c
@@ -7727,13 +7727,13 @@ static struct kvm_x86_ops vmx_x86_ops __
 
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
-	.vcpu_check_block = NULL,
+	.vcpu_check_block = vmx_vcpu_check_block,
 
 	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
-	.start_assignment = NULL,
+	.start_assignment = vmx_pi_start_assignment,
 
 #ifdef CONFIG_X86_64
 	.set_hv_timer = vmx_set_hv_timer,


