kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs
@ 2021-11-23  0:43 Paolo Bonzini
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-23  0:43 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc

The fixed version of the patches from last week.  To sum up, the issue
is the following.

Now that APICv can be disabled per-CPU (depending on whether it has some
setup that is incompatible) we need to deal with guests having a mix of
vCPUs with enabled/disabled posted interrupts.  For assigned devices,
their posted interrupt configuration must be the same across the whole
VM, so handle posted interrupts by hand on vCPUs with disabled posted
interrupts.

Patches 1-3 handle the regular posted interrupt vector, while patch 4
handles the wakeup vector.

Paolo Bonzini (4):
  KVM: x86: ignore APICv if LAPIC is not enabled
  KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  KVM: x86: check PIR even for vCPUs with disabled APICv
  KVM: x86: Use a stable condition around all VT-d PI paths

 arch/x86/kvm/lapic.c           |  2 +-
 arch/x86/kvm/svm/svm.c         |  1 -
 arch/x86/kvm/vmx/posted_intr.c | 20 ++++++++++---------
 arch/x86/kvm/vmx/vmx.c         | 35 ++++++++++++++++++++++------------
 arch/x86/kvm/x86.c             | 18 ++++++++---------
 5 files changed, 44 insertions(+), 32 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled
  2021-11-23  0:43 [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs Paolo Bonzini
@ 2021-11-23  0:43 ` Paolo Bonzini
  2021-11-25  1:36   ` Sean Christopherson
                     ` (2 more replies)
  2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-23  0:43 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, stable

Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
The one in the reenter-guest fast path invoked the callback
unconditionally even if LAPIC is disabled.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a403d92833f..441f4769173e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
 			break;
 
-		if (vcpu->arch.apicv_active)
+		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
 			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
 
 		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  2021-11-23  0:43 [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs Paolo Bonzini
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
@ 2021-11-23  0:43 ` Paolo Bonzini
  2021-11-29 22:14   ` Sean Christopherson
                     ` (2 more replies)
  2021-11-23  0:43 ` [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv Paolo Bonzini
  2021-11-23  0:43 ` [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths Paolo Bonzini
  3 siblings, 3 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-23  0:43 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, stable

If APICv is disabled for this vCPU, assigned devices may
still attempt to post interrupts.  In that case, we need
to cancel the vmentry and deliver the interrupt with
KVM_REQ_EVENT.  Extend the existing code that handles
injection of L1 interrupts into L2 to cover this case
as well.

vmx_hwapic_irr_update is only called when APICv is active
so it would be confusing to add a check for
vcpu->arch.apicv_active in there.  Instead, just use
vmx_set_rvi directly in vmx_sync_pir_to_irr.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ba66c171d951..cccf1eab58ac 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 	int max_irr;
 	bool max_irr_updated;
 
-	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
+	if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
 		return -EIO;
 
 	if (pi_test_on(&vmx->pi_desc)) {
@@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 		smp_mb__after_atomic();
 		max_irr_updated =
 			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
-
-		/*
-		 * If we are running L2 and L1 has a new pending interrupt
-		 * which can be injected, this may cause a vmexit or it may
-		 * be injected into L2.  Either way, this interrupt will be
-		 * processed via KVM_REQ_EVENT, not RVI, because we do not use
-		 * virtual interrupt delivery to inject L1 interrupts into L2.
-		 */
-		if (is_guest_mode(vcpu) && max_irr_updated)
-			kvm_make_request(KVM_REQ_EVENT, vcpu);
 	} else {
 		max_irr = kvm_lapic_find_highest_irr(vcpu);
+		max_irr_updated = false;
 	}
-	vmx_hwapic_irr_update(vcpu, max_irr);
+
+	/*
+	 * If virtual interrupt delivery is not in use, the interrupt
+	 * will be processed via KVM_REQ_EVENT, not RVI.  This can happen
+	 * in two cases:
+	 *
+	 * 1) If we are running L2 and L1 has a new pending interrupt
+	 * which can be injected, this may cause a vmexit or it may
+	 * be injected into L2.  We do not use virtual interrupt
+	 * delivery to inject L1 interrupts into L2.
+	 *
+	 * 2) If APICv is disabled for this vCPU, assigned devices may
+	 * still attempt to post interrupts.  The posted interrupt
+	 * vector will cause a vmexit and the subsequent entry will
+	 * call sync_pir_to_irr.
+	 */
+	if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
+		vmx_set_rvi(max_irr);
+	else if (max_irr_updated)
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
+
 	return max_irr;
 }
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv
  2021-11-23  0:43 [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs Paolo Bonzini
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
  2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
@ 2021-11-23  0:43 ` Paolo Bonzini
  2021-11-29 23:39   ` David Matlack
  2021-11-30  7:58   ` Maxim Levitsky
  2021-11-23  0:43 ` [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths Paolo Bonzini
  3 siblings, 2 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-23  0:43 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, stable

The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
if APICv is disabled on the vCPU that receives it.  In that case, the
interrupt will just cause a vmexit and leave the ON bit set together
with the PIR bit corresponding to the interrupt.

Right now, the interrupt would not be delivered until APICv is re-enabled.
However, fixing this is just a matter of always doing the PIR->IRR
synchronization, even if the vCPU has temporarily disabled APICv.

This is not a problem for performance, or if anything it is an
improvement.  First, in the common case where vcpu->arch.apicv_active is
true, one fewer check has to be performed.  Second, static_call_cond will
elide the function call if APICv is not present or disabled.  Finally,
in the case for AMD hardware we can remove the sync_pir_to_irr callback:
it is only needed for apic_has_interrupt_for_ppr, and that function
already has a fallback for !APICv.

Cc: stable@vger.kernel.org
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/lapic.c   |  2 +-
 arch/x86/kvm/svm/svm.c |  1 -
 arch/x86/kvm/x86.c     | 18 +++++++++---------
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 759952dd1222..f206fc35deff 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
 static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
 {
 	int highest_irr;
-	if (apic->vcpu->arch.apicv_active)
+	if (kvm_x86_ops.sync_pir_to_irr)
 		highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
 	else
 		highest_irr = apic_find_highest_irr(apic);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5630c241d5f6..d0f68d11ec70 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_irr_update = svm_hwapic_irr_update,
 	.hwapic_isr_update = svm_hwapic_isr_update,
-	.sync_pir_to_irr = kvm_lapic_find_highest_irr,
 	.apicv_post_state_restore = avic_post_state_restore,
 
 	.set_tss_addr = svm_set_tss_addr,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 441f4769173e..a8f12c83db4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
 				    struct kvm_lapic_state *s)
 {
-	if (vcpu->arch.apicv_active)
-		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+	static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
 
 	return kvm_apic_get_state(vcpu, s);
 }
@@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (irqchip_split(vcpu->kvm))
 		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
 	else {
-		if (vcpu->arch.apicv_active)
-			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
 		if (ioapic_in_kernel(vcpu->kvm))
 			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
@@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	/*
 	 * This handles the case where a posted interrupt was
-	 * notified with kvm_vcpu_kick.
+	 * notified with kvm_vcpu_kick.  Assigned devices can
+	 * use the POSTED_INTR_VECTOR even if APICv is disabled,
+	 * so do it even if APICv is disabled on this vCPU.
 	 */
-	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
-		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+	if (kvm_lapic_enabled(vcpu))
+		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
 
 	if (kvm_vcpu_exit_request(vcpu)) {
 		vcpu->mode = OUTSIDE_GUEST_MODE;
@@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
 			break;
 
-		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
-			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+		if (kvm_lapic_enabled(vcpu))
+			static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
 
 		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
 			exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths
  2021-11-23  0:43 [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs Paolo Bonzini
                   ` (2 preceding siblings ...)
  2021-11-23  0:43 ` [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv Paolo Bonzini
@ 2021-11-23  0:43 ` Paolo Bonzini
  2021-11-29 23:41   ` David Matlack
  2021-11-30  8:05   ` Maxim Levitsky
  3 siblings, 2 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-23  0:43 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, stable

Currently, checks for whether VT-d PI can be used refer to the current
status of the feature in the current vCPU; or they more or less pick
vCPU 0 in case a specific vCPU is not available.

However, these checks do not attempt to synchronize with changes to
the IRTE.  In particular, there is no path that updates the IRTE when
APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU
that has APICv disabled, if the wakeup occurs because of an IRTE
that points to a posted interrupt.

To fix this, always go through the VT-d PI path as long as there are
assigned devices and APICv is available on both the host and the VM side.
Since the relevant condition was copied over three times, take the hint
and factor it into a separate function.

Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..1c94783b5a54 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -5,6 +5,7 @@
 #include <asm/cpu.h>
 
 #include "lapic.h"
+#include "irq.h"
 #include "posted_intr.h"
 #include "trace.h"
 #include "vmx.h"
@@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 		pi_set_on(pi_desc);
 }
 
+static bool vmx_can_use_vtd_pi(struct kvm *kvm)
+{
+	return irqchip_in_kernel(kvm) && enable_apicv &&
+		kvm_arch_has_assigned_device(kvm) &&
+		irq_remapping_cap(IRQ_POSTING_CAP);
+}
+
 void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
 
-	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
-		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
-		!kvm_vcpu_apicv_active(vcpu))
+	if (!vmx_can_use_vtd_pi(vcpu->kvm))
 		return;
 
 	/* Set SN when the vCPU is preempted */
@@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
 	struct pi_desc old, new;
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
 
-	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
-		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
-		!kvm_vcpu_apicv_active(vcpu))
+	if (!vmx_can_use_vtd_pi(vcpu->kvm))
 		return 0;
 
 	WARN_ON(irqs_disabled());
@@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
 	struct vcpu_data vcpu_info;
 	int idx, ret = 0;
 
-	if (!kvm_arch_has_assigned_device(kvm) ||
-	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
-	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+	if (!vmx_can_use_vtd_pi(kvm))
 		return 0;
 
 	idx = srcu_read_lock(&kvm->irq_srcu);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
@ 2021-11-25  1:36   ` Sean Christopherson
  2021-11-29 23:32   ` David Matlack
  2021-11-30  7:50   ` Maxim Levitsky
  2 siblings, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2021-11-25  1:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, stable

On Mon, Nov 22, 2021, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
>  			break;
>  
> -		if (vcpu->arch.apicv_active)
> +		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)

Ooooh, lapic _enabled_, not just present.  That took me far too long to read...

Reviewed-by: Sean Christopherson <seanjc@google.com>

>  			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
@ 2021-11-29 22:14   ` Sean Christopherson
  2021-11-30  8:31     ` Paolo Bonzini
  2021-11-29 23:34   ` David Matlack
  2021-11-30  7:57   ` Maxim Levitsky
  2 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2021-11-29 22:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, stable

On Mon, Nov 22, 2021, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts.  In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT.  Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
> 
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there.  Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.

Overzealous newlines.

If APICv is disabled for this vCPU, assigned devices may still attempt to
post interrupts.  In that case, we need to cancel the vmentry and deliver
the interrupt with KVM_REQ_EVENT.  Extend the existing code that handles
injection of L1 interrupts into L2 to cover this case as well.

vmx_hwapic_irr_update is only called when APICv is active so it would be
confusing to add a check for vcpu->arch.apicv_active in there.  Instead,
just use vmx_set_rvi directly in vmx_sync_pir_to_irr.


> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  	int max_irr;
>  	bool max_irr_updated;
>  
> -	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> +	if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
>  		return -EIO;
>  
>  	if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  		smp_mb__after_atomic();
>  		max_irr_updated =
>  			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> -		/*
> -		 * If we are running L2 and L1 has a new pending interrupt
> -		 * which can be injected, this may cause a vmexit or it may
> -		 * be injected into L2.  Either way, this interrupt will be
> -		 * processed via KVM_REQ_EVENT, not RVI, because we do not use
> -		 * virtual interrupt delivery to inject L1 interrupts into L2.
> -		 */
> -		if (is_guest_mode(vcpu) && max_irr_updated)
> -			kvm_make_request(KVM_REQ_EVENT, vcpu);
>  	} else {
>  		max_irr = kvm_lapic_find_highest_irr(vcpu);
> +		max_irr_updated = false;

Heh, maybe s/max_irr_updated/new_pir_found or so?  This is a bit weird:

  1. Update max_irr
  2. max_irr_updated = false

>  	}
> -	vmx_hwapic_irr_update(vcpu, max_irr);
> +
> +	/*
> +	 * If virtual interrupt delivery is not in use, the interrupt
> +	 * will be processed via KVM_REQ_EVENT, not RVI.  This can happen

I'd strongly prefer to phrase this as a command, e.g. "process the interrupt via
KVM_REQ_EVENT".  "will be processed" makes it sound like some other flow is
handling the event, which confused me.

> +	 * in two cases:
> +	 *
> +	 * 1) If we are running L2 and L1 has a new pending interrupt

Hmm, calling it L1's interrupt isn't technically wrong, but it's a bit confusing
because it's handled in L2.  Maybe just say "vCPU"?

> +	 * which can be injected, this may cause a vmexit or it may

Overzealous newlines again.

> +	 * be injected into L2.  We do not use virtual interrupt
> +	 * delivery to inject L1 interrupts into L2.

I found the part of "may cause a vmexit or may be injected into L2" hard to
follow, I think because this doesn't explicit state that the KVM_REQ_EVENT is
needed to synthesize a VM-Exit.

How about this for the comment?

	/*
	 * If virtual interrupt delivery is not in use, process the interrupt
	 * via KVM_REQ_EVENT, not RVI.  This is necessary in two cases:
	 *
	 * 1) If L2 is running and the vCPU has a new pending interrupt.  If L1
	 * wants to exit on interrupts, KVM_REQ_EVENT is needed to synthesize a
	 * VM-Exit to L1.  If L1 doesn't want to exit, the interrupt is injected
	 * into L2, but KVM doesn't use virtual interrupt delivery to inject
	 * interrupts into L2, and so KVM_REQ_EVENT is again needed.
	 *
	 * 2) If APICv is disabled for this vCPU, assigned devices may still
	 * attempt to post interrupts.  The posted interrupt vector will cause
	 * a VM-Exit and the subsequent entry will call sync_pir_to_irr.
	 */

> +	 *
> +	 * 2) If APICv is disabled for this vCPU, assigned devices may
> +	 * still attempt to post interrupts.  The posted interrupt
> +	 * vector will cause a vmexit and the subsequent entry will
> +	 * call sync_pir_to_irr.
> +	 */
> +	if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)

If the second check uses kvm_vcpu_apicv_active(), this will avoid a silent conflict
when kvm/master and kvm/queue are merged.

Nits aside, the logic is good:

Reviewed-by: Sean Christopherson <seanjc@google.com>

> +		vmx_set_rvi(max_irr);
> +	else if (max_irr_updated)
> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
>  	return max_irr;
>  }
>  
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
  2021-11-25  1:36   ` Sean Christopherson
@ 2021-11-29 23:32   ` David Matlack
  2021-11-30  7:50   ` Maxim Levitsky
  2 siblings, 0 replies; 16+ messages in thread
From: David Matlack @ 2021-11-29 23:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, stable

On Mon, Nov 22, 2021 at 07:43:08PM -0500, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
>  			break;
>  
> -		if (vcpu->arch.apicv_active)
> +		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
>  			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
  2021-11-29 22:14   ` Sean Christopherson
@ 2021-11-29 23:34   ` David Matlack
  2021-11-30  7:57   ` Maxim Levitsky
  2 siblings, 0 replies; 16+ messages in thread
From: David Matlack @ 2021-11-29 23:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, stable

On Mon, Nov 22, 2021 at 07:43:09PM -0500, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts.  In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT.  Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
> 
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there.  Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

(Although I agree with Sean's suggestions and 1 nit below.)

> ---
>  arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  	int max_irr;
>  	bool max_irr_updated;
>  
> -	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> +	if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
>  		return -EIO;
>  
>  	if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  		smp_mb__after_atomic();
>  		max_irr_updated =
>  			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> -		/*
> -		 * If we are running L2 and L1 has a new pending interrupt
> -		 * which can be injected, this may cause a vmexit or it may
> -		 * be injected into L2.  Either way, this interrupt will be
> -		 * processed via KVM_REQ_EVENT, not RVI, because we do not use
> -		 * virtual interrupt delivery to inject L1 interrupts into L2.
> -		 */
> -		if (is_guest_mode(vcpu) && max_irr_updated)
> -			kvm_make_request(KVM_REQ_EVENT, vcpu);
>  	} else {
>  		max_irr = kvm_lapic_find_highest_irr(vcpu);
> +		max_irr_updated = false;
>  	}
> -	vmx_hwapic_irr_update(vcpu, max_irr);
> +
> +	/*
> +	 * If virtual interrupt delivery is not in use, the interrupt
> +	 * will be processed via KVM_REQ_EVENT, not RVI.  This can happen
> +	 * in two cases:
> +	 *
> +	 * 1) If we are running L2 and L1 has a new pending interrupt
> +	 * which can be injected, this may cause a vmexit or it may
> +	 * be injected into L2.  We do not use virtual interrupt
> +	 * delivery to inject L1 interrupts into L2.
> +	 *
> +	 * 2) If APICv is disabled for this vCPU, assigned devices may
> +	 * still attempt to post interrupts.  The posted interrupt
> +	 * vector will cause a vmexit and the subsequent entry will
> +	 * call sync_pir_to_irr.
> +	 */
> +	if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
> +		vmx_set_rvi(max_irr);
> +	else if (max_irr_updated)
> +		kvm_make_request(KVM_REQ_EVENT, vcpu);

nit: In the version of this code that is currently in kvm/queue the
indentation of the previous 3 lines uses spaces instead of tabs. I see
tabs in this mail though.

> +
>  	return max_irr;
>  }
>  
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv
  2021-11-23  0:43 ` [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv Paolo Bonzini
@ 2021-11-29 23:39   ` David Matlack
  2021-11-30  7:58   ` Maxim Levitsky
  1 sibling, 0 replies; 16+ messages in thread
From: David Matlack @ 2021-11-29 23:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, stable

On Mon, Nov 22, 2021 at 07:43:10PM -0500, Paolo Bonzini wrote:
> The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
> if APICv is disabled on the vCPU that receives it.  In that case, the
> interrupt will just cause a vmexit and leave the ON bit set together
> with the PIR bit corresponding to the interrupt.
> 
> Right now, the interrupt would not be delivered until APICv is re-enabled.
> However, fixing this is just a matter of always doing the PIR->IRR
> synchronization, even if the vCPU has temporarily disabled APICv.
> 
> This is not a problem for performance, or if anything it is an
> improvement.  First, in the common case where vcpu->arch.apicv_active is
> true, one fewer check has to be performed.  Second, static_call_cond will
> elide the function call if APICv is not present or disabled.  Finally,
> in the case for AMD hardware we can remove the sync_pir_to_irr callback:
> it is only needed for apic_has_interrupt_for_ppr, and that function
> already has a fallback for !APICv.
> 
> Cc: stable@vger.kernel.org
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/lapic.c   |  2 +-
>  arch/x86/kvm/svm/svm.c |  1 -
>  arch/x86/kvm/x86.c     | 18 +++++++++---------
>  3 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..f206fc35deff 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
>  static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
>  {
>  	int highest_irr;
> -	if (apic->vcpu->arch.apicv_active)
> +	if (kvm_x86_ops.sync_pir_to_irr)
>  		highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
>  	else
>  		highest_irr = apic_find_highest_irr(apic);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5630c241d5f6..d0f68d11ec70 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.load_eoi_exitmap = svm_load_eoi_exitmap,
>  	.hwapic_irr_update = svm_hwapic_irr_update,
>  	.hwapic_isr_update = svm_hwapic_isr_update,
> -	.sync_pir_to_irr = kvm_lapic_find_highest_irr,
>  	.apicv_post_state_restore = avic_post_state_restore,
>  
>  	.set_tss_addr = svm_set_tss_addr,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 441f4769173e..a8f12c83db4b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>  				    struct kvm_lapic_state *s)
>  {
> -	if (vcpu->arch.apicv_active)
> -		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +	static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  	return kvm_apic_get_state(vcpu, s);
>  }
> @@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  	if (irqchip_split(vcpu->kvm))
>  		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
>  	else {
> -		if (vcpu->arch.apicv_active)
> -			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  		if (ioapic_in_kernel(vcpu->kvm))
>  			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>  	}
> @@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  	/*
>  	 * This handles the case where a posted interrupt was
> -	 * notified with kvm_vcpu_kick.
> +	 * notified with kvm_vcpu_kick.  Assigned devices can
> +	 * use the POSTED_INTR_VECTOR even if APICv is disabled,
> +	 * so do it even if APICv is disabled on this vCPU.
>  	 */
> -	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> -		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +	if (kvm_lapic_enabled(vcpu))
> +		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  	if (kvm_vcpu_exit_request(vcpu)) {
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
> @@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
>  			break;
>  
> -		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> -			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +		if (kvm_lapic_enabled(vcpu))
> +			static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
>  			exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths
  2021-11-23  0:43 ` [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths Paolo Bonzini
@ 2021-11-29 23:41   ` David Matlack
  2021-11-30  8:05   ` Maxim Levitsky
  1 sibling, 0 replies; 16+ messages in thread
From: David Matlack @ 2021-11-29 23:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, stable

On Mon, Nov 22, 2021 at 07:43:11PM -0500, Paolo Bonzini wrote:
> Currently, checks for whether VT-d PI can be used refer to the current
> status of the feature in the current vCPU; or they more or less pick
> vCPU 0 in case a specific vCPU is not available.
> 
> However, these checks do not attempt to synchronize with changes to
> the IRTE.  In particular, there is no path that updates the IRTE when
> APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU
> that has APICv disabled, if the wakeup occurs because of an IRTE
> that points to a posted interrupt.
> 
> To fix this, always go through the VT-d PI path as long as there are
> assigned devices and APICv is available on both the host and the VM side.
> Since the relevant condition was copied over three times, take the hint
> and factor it into a separate function.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..1c94783b5a54 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -5,6 +5,7 @@
>  #include <asm/cpu.h>
>  
>  #include "lapic.h"
> +#include "irq.h"
>  #include "posted_intr.h"
>  #include "trace.h"
>  #include "vmx.h"
> @@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  		pi_set_on(pi_desc);
>  }
>  
> +static bool vmx_can_use_vtd_pi(struct kvm *kvm)
> +{
> +	return irqchip_in_kernel(kvm) && enable_apicv &&
> +		kvm_arch_has_assigned_device(kvm) &&
> +		irq_remapping_cap(IRQ_POSTING_CAP);
> +}
> +
>  void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
>  {
>  	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>  
> -	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> -		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
> -		!kvm_vcpu_apicv_active(vcpu))
> +	if (!vmx_can_use_vtd_pi(vcpu->kvm))
>  		return;
>  
>  	/* Set SN when the vCPU is preempted */
> @@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
>  	struct pi_desc old, new;
>  	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>  
> -	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> -		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
> -		!kvm_vcpu_apicv_active(vcpu))
> +	if (!vmx_can_use_vtd_pi(vcpu->kvm))
>  		return 0;
>  
>  	WARN_ON(irqs_disabled());
> @@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
>  	struct vcpu_data vcpu_info;
>  	int idx, ret = 0;
>  
> -	if (!kvm_arch_has_assigned_device(kvm) ||
> -	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
> -	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> +	if (!vmx_can_use_vtd_pi(kvm))
>  		return 0;
>  
>  	idx = srcu_read_lock(&kvm->irq_srcu);
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled
  2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
  2021-11-25  1:36   ` Sean Christopherson
  2021-11-29 23:32   ` David Matlack
@ 2021-11-30  7:50   ` Maxim Levitsky
  2 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2021-11-30  7:50 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, stable

On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
>  			break;
>  
> -		if (vcpu->arch.apicv_active)
> +		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
>  			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
Reviewed-by: Maxim Levitsky <mlevitk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
  2021-11-29 22:14   ` Sean Christopherson
  2021-11-29 23:34   ` David Matlack
@ 2021-11-30  7:57   ` Maxim Levitsky
  2 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2021-11-30  7:57 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, stable

On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts.  In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT.  Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
> 
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there.  Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  	int max_irr;
>  	bool max_irr_updated;
>  
> -	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> +	if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
>  		return -EIO;
>  
>  	if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  		smp_mb__after_atomic();
>  		max_irr_updated =
>  			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> -		/*
> -		 * If we are running L2 and L1 has a new pending interrupt
> -		 * which can be injected, this may cause a vmexit or it may
> -		 * be injected into L2.  Either way, this interrupt will be
> -		 * processed via KVM_REQ_EVENT, not RVI, because we do not use
> -		 * virtual interrupt delivery to inject L1 interrupts into L2.
> -		 */
> -		if (is_guest_mode(vcpu) && max_irr_updated)
> -			kvm_make_request(KVM_REQ_EVENT, vcpu);
>  	} else {
>  		max_irr = kvm_lapic_find_highest_irr(vcpu);
> +		max_irr_updated = false;
>  	}
> -	vmx_hwapic_irr_update(vcpu, max_irr);
> +
> +	/*
> +	 * If virtual interrupt delivery is not in use, the interrupt
> +	 * will be processed via KVM_REQ_EVENT, not RVI.  This can happen
> +	 * in two cases:
> +	 *
> +	 * 1) If we are running L2 and L1 has a new pending interrupt
> +	 * which can be injected, this may cause a vmexit or it may
> +	 * be injected into L2.  We do not use virtual interrupt
> +	 * delivery to inject L1 interrupts into L2.
> +	 *
> +	 * 2) If APICv is disabled for this vCPU, assigned devices may
> +	 * still attempt to post interrupts.  The posted interrupt
> +	 * vector will cause a vmexit and the subsequent entry will
> +	 * call sync_pir_to_irr.
> +	 */
> +	if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
> +		vmx_set_rvi(max_irr);
> +	else if (max_irr_updated)
> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
>  	return max_irr;
>  }
>  
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv
  2021-11-23  0:43 ` [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv Paolo Bonzini
  2021-11-29 23:39   ` David Matlack
@ 2021-11-30  7:58   ` Maxim Levitsky
  1 sibling, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2021-11-30  7:58 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, stable

On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
> if APICv is disabled on the vCPU that receives it.  In that case, the
> interrupt will just cause a vmexit and leave the ON bit set together
> with the PIR bit corresponding to the interrupt.
> 
> Right now, the interrupt would not be delivered until APICv is re-enabled.
> However, fixing this is just a matter of always doing the PIR->IRR
> synchronization, even if the vCPU has temporarily disabled APICv.
> 
> This is not a problem for performance, or if anything it is an
> improvement.  First, in the common case where vcpu->arch.apicv_active is
> true, one fewer check has to be performed.  Second, static_call_cond will
> elide the function call if APICv is not present or disabled.  Finally,
> in the case for AMD hardware we can remove the sync_pir_to_irr callback:
> it is only needed for apic_has_interrupt_for_ppr, and that function
> already has a fallback for !APICv.
> 
> Cc: stable@vger.kernel.org
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/lapic.c   |  2 +-
>  arch/x86/kvm/svm/svm.c |  1 -
>  arch/x86/kvm/x86.c     | 18 +++++++++---------
>  3 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..f206fc35deff 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
>  static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
>  {
>  	int highest_irr;
> -	if (apic->vcpu->arch.apicv_active)
> +	if (kvm_x86_ops.sync_pir_to_irr)
>  		highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
>  	else
>  		highest_irr = apic_find_highest_irr(apic);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5630c241d5f6..d0f68d11ec70 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.load_eoi_exitmap = svm_load_eoi_exitmap,
>  	.hwapic_irr_update = svm_hwapic_irr_update,
>  	.hwapic_isr_update = svm_hwapic_isr_update,
> -	.sync_pir_to_irr = kvm_lapic_find_highest_irr,
>  	.apicv_post_state_restore = avic_post_state_restore,
>  
>  	.set_tss_addr = svm_set_tss_addr,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 441f4769173e..a8f12c83db4b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>  				    struct kvm_lapic_state *s)
>  {
> -	if (vcpu->arch.apicv_active)
> -		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +	static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  	return kvm_apic_get_state(vcpu, s);
>  }
> @@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  	if (irqchip_split(vcpu->kvm))
>  		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
>  	else {
> -		if (vcpu->arch.apicv_active)
> -			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  		if (ioapic_in_kernel(vcpu->kvm))
>  			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>  	}
> @@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  	/*
>  	 * This handles the case where a posted interrupt was
> -	 * notified with kvm_vcpu_kick.
> +	 * notified with kvm_vcpu_kick.  Assigned devices can
> +	 * use the POSTED_INTR_VECTOR even if APICv is disabled,
> +	 * so do it even if APICv is disabled on this vCPU.
>  	 */
> -	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> -		static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +	if (kvm_lapic_enabled(vcpu))
> +		static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  	if (kvm_vcpu_exit_request(vcpu)) {
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
> @@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
>  			break;
>  
> -		if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> -			static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> +		if (kvm_lapic_enabled(vcpu))
> +			static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>  
>  		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
>  			exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths
  2021-11-23  0:43 ` [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths Paolo Bonzini
  2021-11-29 23:41   ` David Matlack
@ 2021-11-30  8:05   ` Maxim Levitsky
  1 sibling, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2021-11-30  8:05 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, stable

On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> Currently, checks for whether VT-d PI can be used refer to the current
> status of the feature in the current vCPU; or they more or less pick
> vCPU 0 in case a specific vCPU is not available.
> 
> However, these checks do not attempt to synchronize with changes to
> the IRTE.  In particular, there is no path that updates the IRTE when
> APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU
> that has APICv disabled, if the wakeup occurs because of an IRTE
> that points to a posted interrupt.
> 
> To fix this, always go through the VT-d PI path as long as there are
> assigned devices and APICv is available on both the host and the VM side.
> Since the relevant condition was copied over three times, take the hint
> and factor it into a separate function.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..1c94783b5a54 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -5,6 +5,7 @@
>  #include <asm/cpu.h>
>  
>  #include "lapic.h"
> +#include "irq.h"
>  #include "posted_intr.h"
>  #include "trace.h"
>  #include "vmx.h"
> @@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  		pi_set_on(pi_desc);
>  }
>  
> +static bool vmx_can_use_vtd_pi(struct kvm *kvm)
> +{
> +	return irqchip_in_kernel(kvm) && enable_apicv &&
> +		kvm_arch_has_assigned_device(kvm) &&
> +		irq_remapping_cap(IRQ_POSTING_CAP);
> +}
> +
>  void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
>  {
>  	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>  
> -	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> -		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
> -		!kvm_vcpu_apicv_active(vcpu))
> +	if (!vmx_can_use_vtd_pi(vcpu->kvm))
>  		return;
>  
>  	/* Set SN when the vCPU is preempted */
> @@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
>  	struct pi_desc old, new;
>  	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>  
> -	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> -		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
> -		!kvm_vcpu_apicv_active(vcpu))
> +	if (!vmx_can_use_vtd_pi(vcpu->kvm))
>  		return 0;
>  
>  	WARN_ON(irqs_disabled());
> @@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
>  	struct vcpu_data vcpu_info;
>  	int idx, ret = 0;
>  
> -	if (!kvm_arch_has_assigned_device(kvm) ||
> -	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
> -	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> +	if (!vmx_can_use_vtd_pi(kvm))
>  		return 0;
>  
>  	idx = srcu_read_lock(&kvm->irq_srcu);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  2021-11-29 22:14   ` Sean Christopherson
@ 2021-11-30  8:31     ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-11-30  8:31 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, stable

On 11/29/21 23:14, Sean Christopherson wrote:
> Heh, maybe s/max_irr_updated/new_pir_found or so?  This is a bit weird:
> 
>    1. Update max_irr
>    2. max_irr_updated = false

Sounds good (I went for got_posted_interrupt).

>>   	}
>> -	vmx_hwapic_irr_update(vcpu, max_irr);
>> +
>> +	/*
>> +	 * If virtual interrupt delivery is not in use, the interrupt
>> +	 * will be processed via KVM_REQ_EVENT, not RVI.  This can happen
> 
> I'd strongly prefer to phrase this as a command, e.g. "process the interrupt via
> KVM_REQ_EVENT".  "will be processed" makes it sound like some other flow is
> handling the event, which confused me.

What I wanted to convey is that the interrupt is not processed yet, and 
the vmentry might have to be canceled.  I changed it to

          * Newly recognized interrupts are injected via either virtual 
interrupt
          * delivery (RVI) or KVM_REQ_EVENT.  Virtual interrupt delivery is
          * disabled in two cases:

> 	 * 1) If L2 is running and the vCPU has a new pending interrupt.  If L1
> 	 * wants to exit on interrupts, KVM_REQ_EVENT is needed to synthesize a
> 	 * VM-Exit to L1.  If L1 doesn't want to exit, the interrupt is injected
> 	 * into L2, but KVM doesn't use virtual interrupt delivery to inject
> 	 * interrupts into L2, and so KVM_REQ_EVENT is again needed.
> 	 *
> 	 * 2) If APICv is disabled for this vCPU, assigned devices may still
> 	 * attempt to post interrupts.  The posted interrupt vector will cause
> 	 * a VM-Exit and the subsequent entry will call sync_pir_to_irr.
> 	 */

Applied these.

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-11-30  8:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23  0:43 [PATCH 0/4] KVM: VMX: process posted interrupts on APICv-disable vCPUs Paolo Bonzini
2021-11-23  0:43 ` [PATCH 1/4] KVM: x86: ignore APICv if LAPIC is not enabled Paolo Bonzini
2021-11-25  1:36   ` Sean Christopherson
2021-11-29 23:32   ` David Matlack
2021-11-30  7:50   ` Maxim Levitsky
2021-11-23  0:43 ` [PATCH 2/4] KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled Paolo Bonzini
2021-11-29 22:14   ` Sean Christopherson
2021-11-30  8:31     ` Paolo Bonzini
2021-11-29 23:34   ` David Matlack
2021-11-30  7:57   ` Maxim Levitsky
2021-11-23  0:43 ` [PATCH 3/4] KVM: x86: check PIR even for vCPUs with disabled APICv Paolo Bonzini
2021-11-29 23:39   ` David Matlack
2021-11-30  7:58   ` Maxim Levitsky
2021-11-23  0:43 ` [PATCH 4/4] KVM: x86: Use a stable condition around all VT-d PI paths Paolo Bonzini
2021-11-29 23:41   ` David Matlack
2021-11-30  8:05   ` Maxim Levitsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).