* [PATCH 00/13] KVM: x86: Event fixes and cleanup
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Most of this series only really affects nVMX, but there are a few x86
changes sprinkled in.

Patches 1 and 2 are alternative fixes[1][2] for bugs where a #DB destined
for L2 is dropped because a lower priority event, e.g. VMX preemption
timer, is serviced and triggers VM-Exit, and where correctly handling the
#DB can result in the preemption timer being dropped.

Patch 3 fixes a semi-theoretical bug.  I've been intermittently observing
failures when running the preemption timer unit test in L1, but have never
been able to consistently reproduce the bug.  I suspect the issue is
KVM_REQ_EVENT being lost, but can't really confirm this is the case due to
lack of a reproducer.
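For readers not steeped in this code: event injection runs only when
KVM_REQ_EVENT is pending or an interrupt window was requested, so a lost
request means a pending event sits unprocessed until something else
happens to set KVM_REQ_EVENT.  Roughly, from vcpu_enter_guest(), heavily
trimmed:

	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
		/*
		 * The request is consumed here; if KVM bails before
		 * VM-Enter without re-requesting, the event is stranded.
		 */
		if (inject_pending_event(vcpu) != 0)
			req_immediate_exit = true;
		else {
			/* Open SMI/NMI/IRQ windows as needed. */
			...
		}
	}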

Patches 4-7 are cleanup/refactoring in preparation for patch 8, which
fixes non-exiting NMI/INTR priority bugs (similar to the above).  Although
patch 8 is technically a bug
fix, I don't think it's stable material (no sane L1 will notice), which is
why I prioritized (da-dum ching) a clean implementation over an easily
backported patch (a single patch would have been ugly).

Patch 9 fixes a similar issue with SMI priority, and again is probably not
stable material.

Patch 10 addresses a gap in WARN coverage that's effectively introduced
by the bug fix in patch 1.

Patches 11 and 12 replace the extra call to check_nested_events() with a
more precise hack-a-fix.  This is a very small step towards a pipe dream
of processing each event class exactly once per run loop (more below).

Patch 13 is a random optimization that caught my eye when staring at this
code over and over.


I really, really dislike KVM's event handling flow.  In the (distant)
future I'd love to rework the event injection to process each event
exactly once per loop, as opposed to the current behavior where
check_nested_events() can be called at least twice, if not more, depending
on blocking behavior.  That would make it much cleaner to correctly handle
event prioritization, and would likely make the code easier to maintain,
but getting there is a significant rework with a fair number of scary
changes.
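To make the pipe dream concrete, the rough shape would be something like
the following (hypothetical helpers, not actual KVM code):

	static int kvm_process_pending_events(struct kvm_vcpu *vcpu)
	{
		/*
		 * Evaluate each event class exactly once, in architectural
		 * priority order.  Each class either triggers a (nested)
		 * VM-Exit, is injected, or has a window opened for it, and
		 * is never re-checked until the next run loop iteration.
		 */
		if (kvm_process_exception(vcpu) ||
		    kvm_process_smi(vcpu) ||
		    kvm_process_nmi(vcpu) ||
		    kvm_process_irq(vcpu))
			return -EBUSY;

		return 0;
	}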

[1] https://lkml.kernel.org/r/20200414000946.47396-2-jmattson@google.com
[2] https://lkml.kernel.org/r/20200414000946.47396-1-jmattson@google.com

Sean Christopherson (13):
  KVM: nVMX: Preserve exception priority irrespective of exiting
    behavior
  KVM: nVMX: Open a window for pending nested VMX preemption timer
  KVM: x86: Set KVM_REQ_EVENT if run is canceled with req_immediate_exit
    set
  KVM: x86: Make return for {interrupt,nmi}_allowed() a bool instead of
    int
  KVM: nVMX: Move nested_exit_on_nmi() to nested.h
  KVM: nVMX: Report NMIs as allowed when in L2 and Exit-on-NMI is set
  KVM: VMX: Split out architectural interrupt/NMI blocking checks
  KVM: nVMX: Preserve IRQ/NMI priority irrespective of exiting behavior
  KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  KVM: x86: WARN on injected+pending exception even in nested case
  KVM: VMX: Use vmx_interrupt_blocked() directly from vmx_handle_exit()
  KVM: x86: Replace late check_nested_events() hack with more precise
    fix
  KVM: VMX: Use vmx_get_rflags() to query RFLAGS in
    vmx_interrupt_blocked()

 arch/x86/include/asm/kvm_host.h |  6 ++-
 arch/x86/kvm/svm/svm.c          | 10 ++--
 arch/x86/kvm/vmx/nested.c       | 42 +++++++++++------
 arch/x86/kvm/vmx/nested.h       |  5 ++
 arch/x86/kvm/vmx/vmx.c          | 84 +++++++++++++++++++++------------
 arch/x86/kvm/vmx/vmx.h          |  2 +
 arch/x86/kvm/x86.c              | 32 +++++--------
 7 files changed, 113 insertions(+), 68 deletions(-)

-- 
2.26.0



* [PATCH 01/13] KVM: nVMX: Preserve exception priority irrespective of exiting behavior
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Short circuit vmx_check_nested_events() if an exception is pending and
needs to be injected into L2, as priority between coincident events is not
dependent on exiting behavior.  This fixes a bug where a single-step #DB
that is not intercepted by L1 is incorrectly dropped due to servicing a
VMX Preemption Timer VM-Exit.

Injected exceptions also need to be blocked if nested VM-Enter is
pending or an exception was already injected, otherwise injecting the
exception could overwrite an existing event injection from L1.
Technically, this scenario should be impossible, i.e. KVM shouldn't
inject its own exception during nested VM-Enter.  This will be addressed
in a future patch.

Note, event priority between SMI, NMI and INTR is incorrect for L2, e.g.
SMI should take priority over VM-Exit on NMI/INTR, and NMI that is
injected into L2 should take priority over VM-Exit INTR.  This will also
be addressed in a future patch.

Fixes: b6b8a1451fc4 ("KVM: nVMX: Rework interception of IRQs and NMIs")
Reported-by: Jim Mattson <jmattson@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f228339cd0a0..dc7315b31fee 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3716,11 +3716,11 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 	/*
 	 * Process any exceptions that are not debug traps before MTF.
 	 */
-	if (vcpu->arch.exception.pending &&
-	    !vmx_pending_dbg_trap(vcpu) &&
-	    nested_vmx_check_exception(vcpu, &exit_qual)) {
+	if (vcpu->arch.exception.pending && !vmx_pending_dbg_trap(vcpu)) {
 		if (block_nested_events)
 			return -EBUSY;
+		if (!nested_vmx_check_exception(vcpu, &exit_qual))
+			goto no_vmexit;
 		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
 		return 0;
 	}
@@ -3733,10 +3733,11 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if (vcpu->arch.exception.pending &&
-	    nested_vmx_check_exception(vcpu, &exit_qual)) {
+	if (vcpu->arch.exception.pending) {
 		if (block_nested_events)
 			return -EBUSY;
+		if (!nested_vmx_check_exception(vcpu, &exit_qual))
+			goto no_vmexit;
 		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
 		return 0;
 	}
@@ -3771,6 +3772,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
+no_vmexit:
 	vmx_complete_nested_posted_interrupt(vcpu);
 	return 0;
 }
-- 
2.26.0



* [PATCH 02/13] KVM: nVMX: Open a window for pending nested VMX preemption timer
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Add a kvm_x86_ops hook to detect a nested pending "hypervisor timer" and
use it to effectively open a window for servicing the expired timer.
Like pending SMIs on VMX, opening a window simply means requesting an
immediate exit.

This fixes a bug where an expired VMX preemption timer (for L2) will be
delayed and/or lost if a pending exception is injected into L2.  The
pending exception is rightly prioritized by vmx_check_nested_events()
and injected into L2, with the preemption timer left pending.  Because
no window opened, L2 is free to run uninterrupted.

Fixes: f4124500c2c13 ("KVM: nVMX: Fully emulate preemption timer")
Reported-by: Jim Mattson <jmattson@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/nested.c       | 10 ++++++++--
 arch/x86/kvm/x86.c              |  4 ++++
 3 files changed, 13 insertions(+), 2 deletions(-)
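Note, "requesting an immediate exit" on VMX is implemented by arming the
hardware preemption timer with a zero deadline, which forces a VM-Exit
immediately after VM-Enter; simplified from the existing code (the
normal, non-immediate path is omitted):

	static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
	{
		to_vmx(vcpu)->req_immediate_exit = true;
	}

	static void vmx_update_hv_timer(struct kvm_vcpu *vcpu)
	{
		struct vcpu_vmx *vmx = to_vmx(vcpu);

		if (vmx->req_immediate_exit) {
			/* A zero deadline fires the timer at VM-Enter. */
			vmx_arm_hv_timer(vmx, 0);
			return;
		}
		...
	}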

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f26df2cb0591..65dc2c88d8b2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1179,6 +1179,7 @@ struct kvm_x86_ops {
 	void (*handle_exit_irqoff)(struct kvm_vcpu *vcpu);
 
 	int (*check_nested_events)(struct kvm_vcpu *vcpu);
+	bool (*nested_hv_timer_pending)(struct kvm_vcpu *vcpu);
 	void (*request_immediate_exit)(struct kvm_vcpu *vcpu);
 
 	void (*sched_in)(struct kvm_vcpu *kvm, int cpu);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index dc7315b31fee..63cf339a13ac 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3687,6 +3687,12 @@ static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu)
 			    vcpu->arch.exception.payload);
 }
 
+static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu)
+{
+	return nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
+	       to_vmx(vcpu)->nested.preemption_timer_expired;
+}
+
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3742,8 +3748,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
-	    vmx->nested.preemption_timer_expired) {
+	if (nested_vmx_preemption_timer_pending(vcpu)) {
 		if (block_nested_events)
 			return -EBUSY;
 		nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
@@ -6443,6 +6448,7 @@ __init int nested_vmx_hardware_setup(struct kvm_x86_ops *ops,
 	exit_handlers[EXIT_REASON_VMFUNC]	= handle_vmfunc;
 
 	ops->check_nested_events = vmx_check_nested_events;
+	ops->nested_hv_timer_pending = nested_vmx_preemption_timer_pending;
 	ops->get_nested_state = vmx_get_nested_state;
 	ops->set_nested_state = vmx_set_nested_state;
 	ops->get_vmcs12_pages = nested_get_vmcs12_pages;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 59958ce2b681..ecd612807546 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8324,6 +8324,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 				kvm_x86_ops.enable_nmi_window(vcpu);
 			if (kvm_cpu_has_injectable_intr(vcpu) || req_int_win)
 				kvm_x86_ops.enable_irq_window(vcpu);
+			if (is_guest_mode(vcpu) &&
+			    kvm_x86_ops.nested_hv_timer_pending &&
+			    kvm_x86_ops.nested_hv_timer_pending(vcpu))
+				req_immediate_exit = true;
 			WARN_ON(vcpu->arch.exception.pending);
 		}
 
-- 
2.26.0



* [PATCH 03/13] KVM: x86: Set KVM_REQ_EVENT if run is canceled with req_immediate_exit set
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Re-request KVM_REQ_EVENT if vcpu_enter_guest() bails after processing
pending requests and an immediate exit was requested.  This fixes a bug
where a pending event, e.g. VMX preemption timer, is delayed and/or lost
if the exit was deferred due to something other than a higher priority
_injected_ event, e.g. due to a pending nested VM-Enter.  This bug only
affects the !injected case as kvm_x86_ops.cancel_injection() sets
KVM_REQ_EVENT to redo the injection, but that's purely serendipitous
behavior with respect to the deferred event.

Note, emulated preemption timer isn't the only event that can be
affected; it simply happens to be the only event where not re-requesting
KVM_REQ_EVENT is blatantly visible to the guest.

Fixes: f4124500c2c13 ("KVM: nVMX: Fully emulate preemption timer")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/x86.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ecd612807546..6af873b7e0ae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8489,6 +8489,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	return r;
 
 cancel_injection:
+	if (req_immediate_exit)
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
 	kvm_x86_ops.cancel_injection(vcpu);
 	if (unlikely(vcpu->arch.apic_attention))
 		kvm_lapic_sync_from_vapic(vcpu);
-- 
2.26.0



* [PATCH 04/13] KVM: x86: Make return for {interrupt,nmi}_allowed() a bool instead of int
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Return an actual bool for kvm_x86_ops' {interrupt,nmi}_allowed() hooks to
better reflect the return semantics, and to avoid creating an even
bigger mess when the related VMX code is refactored in upcoming patches.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++--
 arch/x86/kvm/svm/svm.c          | 9 +++++----
 arch/x86/kvm/vmx/vmx.c          | 8 ++++----
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 65dc2c88d8b2..787636acd648 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1139,8 +1139,8 @@ struct kvm_x86_ops {
 	void (*set_nmi)(struct kvm_vcpu *vcpu);
 	void (*queue_exception)(struct kvm_vcpu *vcpu);
 	void (*cancel_injection)(struct kvm_vcpu *vcpu);
-	int (*interrupt_allowed)(struct kvm_vcpu *vcpu);
-	int (*nmi_allowed)(struct kvm_vcpu *vcpu);
+	bool (*interrupt_allowed)(struct kvm_vcpu *vcpu);
+	bool (*nmi_allowed)(struct kvm_vcpu *vcpu);
 	bool (*get_nmi_mask)(struct kvm_vcpu *vcpu);
 	void (*set_nmi_mask)(struct kvm_vcpu *vcpu, bool masked);
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index eb95283ab68d..f21f734861dd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3062,11 +3062,12 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 		set_cr_intercept(svm, INTERCEPT_CR8_WRITE);
 }
 
-static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
+static bool svm_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb;
-	int ret;
+	bool ret;
+
 	ret = !(vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) &&
 	      !(svm->vcpu.arch.hflags & HF_NMI_MASK);
 	ret = ret && gif_set(svm) && nested_svm_nmi(svm);
@@ -3094,14 +3095,14 @@ static void svm_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 	}
 }
 
-static int svm_interrupt_allowed(struct kvm_vcpu *vcpu)
+static bool svm_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb;
 
 	if (!gif_set(svm) ||
 	     (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK))
-		return 0;
+		return false;
 
 	if (is_guest_mode(vcpu) && (svm->vcpu.arch.hflags & HF_VINTR_MASK))
 		return !!(svm->vcpu.arch.hflags & HF_HIF_MASK);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 766303b31949..7dd42e7fef94 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4511,21 +4511,21 @@ void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 	}
 }
 
-static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
+static bool vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	if (to_vmx(vcpu)->nested.nested_run_pending)
-		return 0;
+		return false;
 
 	if (!enable_vnmi &&
 	    to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
-		return 0;
+		return false;
 
 	return	!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
 		  (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
 		   | GUEST_INTR_STATE_NMI));
 }
 
-static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
+static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
 	if (to_vmx(vcpu)->nested.nested_run_pending)
 		return false;
-- 
2.26.0



* [PATCH 05/13] KVM: nVMX: Move nested_exit_on_nmi() to nested.h
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Expose nested_exit_on_nmi() for use by vmx_nmi_allowed() in a future
patch.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 5 -----
 arch/x86/kvm/vmx/nested.h | 5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 63cf339a13ac..40b2427f35b7 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -698,11 +698,6 @@ static bool nested_exit_intr_ack_set(struct kvm_vcpu *vcpu)
 		VM_EXIT_ACK_INTR_ON_EXIT;
 }
 
-static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
-{
-	return nested_cpu_has_nmi_exiting(get_vmcs12(vcpu));
-}
-
 static int nested_vmx_check_apic_access_controls(struct kvm_vcpu *vcpu,
 					  struct vmcs12 *vmcs12)
 {
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 1514ff4db77f..7d7475549b9f 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -225,6 +225,11 @@ static inline bool nested_cpu_has_save_preemption_timer(struct vmcs12 *vmcs12)
 	    VM_EXIT_SAVE_VMX_PREEMPTION_TIMER;
 }
 
+static inline bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
+{
+	return nested_cpu_has_nmi_exiting(get_vmcs12(vcpu));
+}
+
 /*
  * In nested virtualization, check if L1 asked to exit on external interrupts.
  * For most existing hypervisors, this will always return true.
-- 
2.26.0



* [PATCH 06/13] KVM: nVMX: Report NMIs as allowed when in L2 and Exit-on-NMI is set
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Report NMIs as allowed when the vCPU is in L2 and L2 is being run with
Exit-on-NMI enabled, as NMIs are always unblocked from L1's perspective
in this case.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7dd42e7fef94..c7bb9d90d441 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4516,6 +4516,9 @@ static bool vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 	if (to_vmx(vcpu)->nested.nested_run_pending)
 		return false;
 
+	if (is_guest_mode(vcpu) && nested_exit_on_nmi(vcpu))
+		return true;
+
 	if (!enable_vnmi &&
 	    to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
 		return false;
-- 
2.26.0



* [PATCH 07/13] KVM: VMX: Split out architectural interrupt/NMI blocking checks
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Move the architectural (non-KVM specific) interrupt/NMI blocking checks
to a separate helper so that they can be used in a future patch by
vmx_check_nested_events().

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 41 +++++++++++++++++++++++++----------------
 arch/x86/kvm/vmx/vmx.h |  2 ++
 2 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c7bb9d90d441..50c726a21feb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4511,34 +4511,43 @@ void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 	}
 }
 
-static bool vmx_nmi_allowed(struct kvm_vcpu *vcpu)
+bool vmx_nmi_blocked(struct kvm_vcpu *vcpu)
 {
-	if (to_vmx(vcpu)->nested.nested_run_pending)
-		return false;
-
 	if (is_guest_mode(vcpu) && nested_exit_on_nmi(vcpu))
+		return false;
+
+	if (!enable_vnmi && to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
 		return true;
 
-	if (!enable_vnmi &&
-	    to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
-		return false;
-
-	return	!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
-		  (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
-		   | GUEST_INTR_STATE_NMI));
+	return (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
+		(GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI |
+		 GUEST_INTR_STATE_NMI));
 }
 
-static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
+static bool vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	if (to_vmx(vcpu)->nested.nested_run_pending)
 		return false;
 
+	return !vmx_nmi_blocked(vcpu);
+}
+
+bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu)
+{
 	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
-		return true;
+		return false;
 
-	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
-		!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
-			(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
+	return !(vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) ||
+	       (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
+		(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
+}
+
+static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
+{
+	if (to_vmx(vcpu)->nested.nested_run_pending)
+		return false;
+
+	return !vmx_interrupt_blocked(vcpu);
 }
 
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index edfb739e5907..b5e773267abe 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -344,6 +344,8 @@ void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
 u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa);
 void update_exception_bitmap(struct kvm_vcpu *vcpu);
 void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
+bool vmx_nmi_blocked(struct kvm_vcpu *vcpu);
+bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu);
 bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu);
 void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
 void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
-- 
2.26.0



* [PATCH 08/13] KVM: nVMX: Preserve IRQ/NMI priority irrespective of exiting behavior
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Short circuit vmx_check_nested_events() if an unblocked IRQ/NMI is
pending and needs to be injected into L2, as priority between coincident
events is not dependent on exiting behavior.

Fixes: b6b8a1451fc4 ("KVM: nVMX: Rework interception of IRQs and NMIs")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 40b2427f35b7..1fdaca5fd93d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3750,9 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) {
+	if (vcpu->arch.nmi_pending && !vmx_nmi_blocked(vcpu)) {
 		if (block_nested_events)
 			return -EBUSY;
+		if (!nested_exit_on_nmi(vcpu))
+			goto no_vmexit;
+
 		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
 				  NMI_VECTOR | INTR_TYPE_NMI_INTR |
 				  INTR_INFO_VALID_MASK, 0);
@@ -3765,9 +3768,11 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if (kvm_cpu_has_interrupt(vcpu) && nested_exit_on_intr(vcpu)) {
+	if (kvm_cpu_has_interrupt(vcpu) && !vmx_interrupt_blocked(vcpu)) {
 		if (block_nested_events)
 			return -EBUSY;
+		if (!nested_exit_on_intr(vcpu))
+			goto no_vmexit;
 		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
 		return 0;
 	}
-- 
2.26.0



* [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Check for an unblocked SMI in vmx_check_nested_events() so that pending
SMIs are correctly prioritized over IRQs and NMIs when the latter events
will trigger VM-Exit.  This also fixes an issue where an SMI that was
marked pending while processing a nested VM-Enter wouldn't trigger an
immediate exit, i.e. would be incorrectly delayed until L2 happened to
take a VM-Exit.

Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 6 ++++++
 1 file changed, 6 insertions(+)
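For reference, my summary of the resulting order of the event checks in
vmx_check_nested_events() as of this patch (derived from the diffs in
this series):

	/*
	 * 1. Exceptions that are not debug traps
	 * 2. MTF
	 * 3. Remaining exceptions, i.e. debug traps
	 * 4. VMX preemption timer
	 * 5. SMI (added by this patch)
	 * 6. NMI
	 * 7. INTR
	 */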

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1fdaca5fd93d..8c16b190816b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
+	if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
+		if (block_nested_events)
+			return -EBUSY;
+		goto no_vmexit;
+	}
+
 	if (vcpu->arch.nmi_pending && !vmx_nmi_blocked(vcpu)) {
 		if (block_nested_events)
 			return -EBUSY;
-- 
2.26.0



* [PATCH 10/13] KVM: x86: WARN on injected+pending exception even in nested case
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

WARN if a pending exception is coincident with an injected exception
before calling check_nested_events() so that the WARN will fire even if
inject_pending_event() bails early because check_nested_events() detects
the conflict.  Bailing early isn't problematic (quite the opposite), but
suppressing the WARN is undesirable as it could mask a bug elsewhere in
KVM.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6af873b7e0ae..7c49a7dc601f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7693,6 +7693,9 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
 			kvm_x86_ops.set_irq(vcpu);
 	}
 
+	WARN_ON_ONCE(vcpu->arch.exception.injected &&
+		     vcpu->arch.exception.pending);
+
 	/*
 	 * Call check_nested_events() even if we reinjected a previous event
 	 * in order for caller to determine if it should require immediate-exit
@@ -7711,7 +7714,6 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
 					vcpu->arch.exception.has_error_code,
 					vcpu->arch.exception.error_code);
 
-		WARN_ON_ONCE(vcpu->arch.exception.injected);
 		vcpu->arch.exception.pending = false;
 		vcpu->arch.exception.injected = true;
 
-- 
2.26.0



* [PATCH 11/13] KVM: VMX: Use vmx_interrupt_blocked() directly from vmx_handle_exit()
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Use vmx_interrupt_blocked() instead of bouncing through
vmx_interrupt_allowed() when handling edge cases in vmx_handle_exit().
The nested_run_pending check in vmx_interrupt_allowed() should never
evaluate true in the VM-Exit path.

Hoist the WARN in handle_invalid_guest_state() up to vmx_handle_exit()
to enforce the above assumption for the !enable_vnmi case, and to detect
any other potential bugs with nested VM-Enter.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 50c726a21feb..2f8cacb3aa9b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5268,18 +5268,11 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
 	bool intr_window_requested;
 	unsigned count = 130;
 
-	/*
-	 * We should never reach the point where we are emulating L2
-	 * due to invalid guest state as that means we incorrectly
-	 * allowed a nested VMEntry with an invalid vmcs12.
-	 */
-	WARN_ON_ONCE(vmx->emulation_required && vmx->nested.nested_run_pending);
-
 	intr_window_requested = exec_controls_get(vmx) &
 				CPU_BASED_INTR_WINDOW_EXITING;
 
 	while (vmx->emulation_required && count-- != 0) {
-		if (intr_window_requested && vmx_interrupt_allowed(vcpu))
+		if (intr_window_requested && !vmx_interrupt_blocked(vcpu))
 			return handle_interrupt_window(&vmx->vcpu);
 
 		if (kvm_test_request(KVM_REQ_EVENT, vcpu))
@@ -5896,6 +5889,14 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu,
 	if (enable_pml)
 		vmx_flush_pml_buffer(vcpu);
 
+	/*
+	 * We should never reach this point with a pending nested VM-Enter, and
+	 * more specifically emulation of L2 due to invalid guest state (see
+	 * below) should never happen as that means we incorrectly allowed a
+	 * nested VM-Enter with an invalid vmcs12.
+	 */
+	WARN_ON_ONCE(vmx->nested.nested_run_pending);
+
 	/* If guest state is invalid, start emulating */
 	if (vmx->emulation_required)
 		return handle_invalid_guest_state(vcpu);
@@ -5962,7 +5963,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu,
 
 	if (unlikely(!enable_vnmi &&
 		     vmx->loaded_vmcs->soft_vnmi_blocked)) {
-		if (vmx_interrupt_allowed(vcpu)) {
+		if (!vmx_interrupt_blocked(vcpu)) {
 			vmx->loaded_vmcs->soft_vnmi_blocked = 0;
 		} else if (vmx->loaded_vmcs->vnmi_blocked_time > 1000000000LL &&
 			   vcpu->arch.nmi_pending) {
-- 
2.26.0



* [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Add a separate hook for checking if interrupt injection is blocked and
use the hook to handle the case where an interrupt arrives between
check_nested_events() and the injection logic.  Drop the retry of
check_nested_events() that hack-a-fixed the same condition.

Blocking injection is also a bit of a hack, e.g. KVM should do exiting
and non-exiting interrupt processing in a single pass, but it's a more
precise hack.  The old comment is also misleading, e.g. KVM_REQ_EVENT is
purely an optimization; setting it on every run loop (which KVM doesn't
do) should not affect functionality, only performance.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm/svm.c          |  1 +
 arch/x86/kvm/vmx/vmx.c          | 13 +++++++++++++
 arch/x86/kvm/x86.c              | 22 ++++------------------
 4 files changed, 19 insertions(+), 18 deletions(-)
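For reference, a sketch of the race the new hook closes (my reading of
the flow, expressed as a timeline rather than code):

	/*
	 * 1. inject_pending_event() calls check_nested_events(); no IRQ
	 *    is pending, so no VM-Exit to L1 is triggered.
	 * 2. An IRQ intended for L1 arrives asynchronously, e.g. via a
	 *    posted interrupt or a kick from another thread.
	 * 3. kvm_cpu_has_injectable_intr() is now true; without this
	 *    hook (or the old check_nested_events() retry), the IRQ
	 *    would be injected directly into L2 even though L1 wants
	 *    to intercept it.
	 */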

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 787636acd648..16fdeddb4a65 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1140,6 +1140,7 @@ struct kvm_x86_ops {
 	void (*queue_exception)(struct kvm_vcpu *vcpu);
 	void (*cancel_injection)(struct kvm_vcpu *vcpu);
 	bool (*interrupt_allowed)(struct kvm_vcpu *vcpu);
+	bool (*interrupt_injection_allowed)(struct kvm_vcpu *vcpu);
 	bool (*nmi_allowed)(struct kvm_vcpu *vcpu);
 	bool (*get_nmi_mask)(struct kvm_vcpu *vcpu);
 	void (*set_nmi_mask)(struct kvm_vcpu *vcpu, bool masked);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f21f734861dd..6d3ccbfc9e6a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3993,6 +3993,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.queue_exception = svm_queue_exception,
 	.cancel_injection = svm_cancel_injection,
 	.interrupt_allowed = svm_interrupt_allowed,
+	.interrupt_injection_allowed = svm_interrupt_allowed,
 	.nmi_allowed = svm_nmi_allowed,
 	.get_nmi_mask = svm_get_nmi_mask,
 	.set_nmi_mask = svm_set_nmi_mask,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2f8cacb3aa9b..68b3748b5383 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4550,6 +4550,18 @@ static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 	return !vmx_interrupt_blocked(vcpu);
 }
 
+static bool vmx_interrupt_injection_allowed(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
+	 * e.g. if the IRQ arrived asynchronously after checking nested events.
+	 */
+	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
+		return false;
+
+	return vmx_interrupt_allowed(vcpu);
+}
+
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
 {
 	int ret;
@@ -7823,6 +7835,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.queue_exception = vmx_queue_exception,
 	.cancel_injection = vmx_cancel_injection,
 	.interrupt_allowed = vmx_interrupt_allowed,
+	.interrupt_injection_allowed = vmx_interrupt_injection_allowed,
 	.nmi_allowed = vmx_nmi_allowed,
 	.get_nmi_mask = vmx_get_nmi_mask,
 	.set_nmi_mask = vmx_set_nmi_mask,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7c49a7dc601f..d9d6028a77e0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7755,24 +7755,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
 		--vcpu->arch.nmi_pending;
 		vcpu->arch.nmi_injected = true;
 		kvm_x86_ops.set_nmi(vcpu);
-	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
-		/*
-		 * Because interrupts can be injected asynchronously, we are
-		 * calling check_nested_events again here to avoid a race condition.
-		 * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
-		 * proposal and current concerns.  Perhaps we should be setting
-		 * KVM_REQ_EVENT only on certain events and not unconditionally?
-		 */
-		if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
-			r = kvm_x86_ops.check_nested_events(vcpu);
-			if (r != 0)
-				return r;
-		}
-		if (kvm_x86_ops.interrupt_allowed(vcpu)) {
-			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
-					    false);
-			kvm_x86_ops.set_irq(vcpu);
-		}
+	} else if (kvm_cpu_has_injectable_intr(vcpu) &&
+		   kvm_x86_ops.interrupt_injection_allowed(vcpu)) {
+		kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
+		kvm_x86_ops.set_irq(vcpu);
 	}
 
 	return 0;
-- 
2.26.0



* [PATCH 13/13] KVM: VMX: Use vmx_get_rflags() to query RFLAGS in vmx_interrupt_blocked()
From: Sean Christopherson @ 2020-04-23  2:25 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Oliver Upton, Peter Shier

Use vmx_get_rflags() instead of manually reading vmcs.GUEST_RFLAGS when
querying RFLAGS.IF so that multiple checks against interrupt blocking in
a single run loop only require a single VMREAD.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
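For reference, vmx_get_rflags() caches RFLAGS so that only the first
read after a VM-Exit performs a VMREAD; a simplified sketch of the
existing helper (real-mode handling omitted):

	unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)
	{
		struct vcpu_vmx *vmx = to_vmx(vcpu);

		if (!kvm_register_is_available(vcpu, VCPU_EXREG_RFLAGS)) {
			kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS);
			vmx->rflags = vmcs_readl(GUEST_RFLAGS);
		}
		return vmx->rflags;
	}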

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 68b3748b5383..9c4dd481b588 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4537,7 +4537,7 @@ bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
 		return false;
 
-	return !(vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) ||
+	return !(vmx_get_rflags(vcpu) & X86_EFLAGS_IF) ||
 	       (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
 		(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
 }
-- 
2.26.0



* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
From: Paolo Bonzini @ 2020-04-23 11:00 UTC
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Oliver Upton, Peter Shier

On 23/04/20 04:25, Sean Christopherson wrote:
> Add a separate hook for checking if interrupt injection is blocked and
> use the hook to handle the case where an interrupt arrives between
> check_nested_events() and the injection logic.  Drop the retry of
> check_nested_events() that hack-a-fixed the same condition.
> 
> Blocking injection is also a bit of a hack, e.g. KVM should do exiting
> and non-exiting interrupt processing in a single pass, but it's a more
> precise hack.  The old comment is also misleading, e.g. KVM_REQ_EVENT is
> purely an optimization; setting it on every run loop (which KVM doesn't
> do) should not affect functionality, only performance.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/svm/svm.c          |  1 +
>  arch/x86/kvm/vmx/vmx.c          | 13 +++++++++++++
>  arch/x86/kvm/x86.c              | 22 ++++------------------
>  4 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 787636acd648..16fdeddb4a65 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1140,6 +1140,7 @@ struct kvm_x86_ops {
>  	void (*queue_exception)(struct kvm_vcpu *vcpu);
>  	void (*cancel_injection)(struct kvm_vcpu *vcpu);
>  	bool (*interrupt_allowed)(struct kvm_vcpu *vcpu);
> +	bool (*interrupt_injection_allowed)(struct kvm_vcpu *vcpu);
>  	bool (*nmi_allowed)(struct kvm_vcpu *vcpu);
>  	bool (*get_nmi_mask)(struct kvm_vcpu *vcpu);
>  	void (*set_nmi_mask)(struct kvm_vcpu *vcpu, bool masked);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index f21f734861dd..6d3ccbfc9e6a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3993,6 +3993,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.queue_exception = svm_queue_exception,
>  	.cancel_injection = svm_cancel_injection,
>  	.interrupt_allowed = svm_interrupt_allowed,
> +	.interrupt_injection_allowed = svm_interrupt_allowed,
>  	.nmi_allowed = svm_nmi_allowed,
>  	.get_nmi_mask = svm_get_nmi_mask,
>  	.set_nmi_mask = svm_set_nmi_mask,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2f8cacb3aa9b..68b3748b5383 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4550,6 +4550,18 @@ static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>  	return !vmx_interrupt_blocked(vcpu);
>  }
>  
> +static bool vmx_interrupt_injection_allowed(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
> +	 * e.g. if the IRQ arrived asynchronously after checking nested events.
> +	 */
> +	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
> +		return false;
> +
> +	return vmx_interrupt_allowed(vcpu);
> +}
> +
>  static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
>  {
>  	int ret;
> @@ -7823,6 +7835,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.queue_exception = vmx_queue_exception,
>  	.cancel_injection = vmx_cancel_injection,
>  	.interrupt_allowed = vmx_interrupt_allowed,
> +	.interrupt_injection_allowed = vmx_interrupt_injection_allowed,
>  	.nmi_allowed = vmx_nmi_allowed,
>  	.get_nmi_mask = vmx_get_nmi_mask,
>  	.set_nmi_mask = vmx_set_nmi_mask,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7c49a7dc601f..d9d6028a77e0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7755,24 +7755,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
>  		--vcpu->arch.nmi_pending;
>  		vcpu->arch.nmi_injected = true;
>  		kvm_x86_ops.set_nmi(vcpu);
> -	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
> -		/*
> -		 * Because interrupts can be injected asynchronously, we are
> -		 * calling check_nested_events again here to avoid a race condition.
> -		 * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
> -		 * proposal and current concerns.  Perhaps we should be setting
> -		 * KVM_REQ_EVENT only on certain events and not unconditionally?
> -		 */
> -		if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
> -			r = kvm_x86_ops.check_nested_events(vcpu);
> -			if (r != 0)
> -				return r;
> -		}
> -		if (kvm_x86_ops.interrupt_allowed(vcpu)) {
> -			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
> -					    false);
> -			kvm_x86_ops.set_irq(vcpu);
> -		}
> +	} else if (kvm_cpu_has_injectable_intr(vcpu) &&
> +		   kvm_x86_ops.interrupt_injection_allowed(vcpu)) {
> +		kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
> +		kvm_x86_ops.set_irq(vcpu);

Hmm I'm interested in how this can help with AMD introducing another
instance of the late random check_nested_events.  I'll play with it.

Paolo



* Re: [PATCH 01/13] KVM: nVMX: Preserve exception priority irrespective of exiting behavior
From: Jim Mattson @ 2020-04-28 18:54 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Short circuit vmx_check_nested_events() if an exception is pending and
> needs to be injected into L2, as priority between coincident events is not
> dependent on exiting behavior.  This fixes a bug where a single-step #DB
> that is not intercepted by L1 is incorrectly dropped due to servicing a
> VMX Preemption Timer VM-Exit.
>
> Injected exceptions also need to be blocked if nested VM-Enter is
> pending or an exception was already injected, otherwise injecting the
> exception could overwrite an existing event injection from L1.
> Technically, this scenario should be impossible, i.e. KVM shouldn't
> inject its own exception during nested VM-Enter.  This will be addressed
> in a future patch.
>
> Note, event priority between SMI, NMI and INTR is incorrect for L2, e.g.
> SMI should take priority over VM-Exit on NMI/INTR, and NMI that is
> injected into L2 should take priority over VM-Exit INTR.  This will also
> be addressed in a future patch.
>
> Fixes: b6b8a1451fc4 ("KVM: nVMX: Rework interception of IRQs and NMIs")
> Reported-by: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>


* Re: [PATCH 01/13] KVM: nVMX: Preserve exception priority irrespective of exiting behavior
From: Oliver Upton @ 2020-04-28 20:07 UTC
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML, Peter Shier

On Tue, Apr 28, 2020 at 11:54 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Short circuit vmx_check_nested_events() if an exception is pending and
> > needs to be injected into L2, as priority between coincident events is not
> > dependent on exiting behavior.  This fixes a bug where a single-step #DB
> > that is not intercepted by L1 is incorrectly dropped due to servicing a
> > VMX Preemption Timer VM-Exit.
> >
> > Injected exceptions also need to be blocked if nested VM-Enter is
> > pending or an exception was already injected, otherwise injecting the
> > exception could overwrite an existing event injection from L1.
> > Technically, this scenario should be impossible, i.e. KVM shouldn't
> > inject its own exception during nested VM-Enter.  This will be addressed
> > in a future patch.
> >
> > Note, event priority between SMI, NMI and INTR is incorrect for L2, e.g.
> > SMI should take priority over VM-Exit on NMI/INTR, and NMI that is
> > injected into L2 should take priority over VM-Exit INTR.  This will also
> > be addressed in a future patch.
> >
> > Fixes: b6b8a1451fc4 ("KVM: nVMX: Rework interception of IRQs and NMIs")
> > Reported-by: Jim Mattson <jmattson@google.com>
> > Cc: Oliver Upton <oupton@google.com>
> > Cc: Peter Shier <pshier@google.com>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>


* Re: [PATCH 02/13] KVM: nVMX: Open a window for pending nested VMX preemption timer
From: Jim Mattson @ 2020-04-28 21:39 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Add a kvm_x86_ops hook to detect a nested pending "hypervisor timer" and
> use it to effectively open a window for servicing the expired timer.
> Like pending SMIs on VMX, opening a window simply means requesting an
> immediate exit.
>
> This fixes a bug where an expired VMX preemption timer (for L2) will be
> delayed and/or lost if a pending exception is injected into L2.  The
> pending exception is rightly prioritized by vmx_check_nested_events()
> and injected into L2, with the preemption timer left pending.  Because
> no window opened, L2 is free to run uninterrupted.
>
> Fixes: f4124500c2c13 ("KVM: nVMX: Fully emulate preemption timer")
> Reported-by: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>


* Re: [PATCH 03/13] KVM: x86: Set KVM_REQ_EVENT if run is canceled with req_immediate_exit set
From: Jim Mattson @ 2020-04-28 21:41 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Re-request KVM_REQ_EVENT if vcpu_enter_guest() bails after processing
> pending requests and an immediate exit was requested.  This fixes a bug
> where a pending event, e.g. VMX preemption timer, is delayed and/or lost
> if the exit was deferred due to something other than a higher priority
> _injected_ event, e.g. due to a pending nested VM-Enter.  This bug only
> affects the !injected case as kvm_x86_ops.cancel_injection() sets
> KVM_REQ_EVENT to redo the injection, but that's purely serendipitous
> behavior with respect to the deferred event.
>
> Note, emulated preemption timer isn't the only event that can be
> affected; it simply happens to be the only event where not re-requesting
> KVM_REQ_EVENT is blatantly visible to the guest.
>
> Fixes: f4124500c2c13 ("KVM: nVMX: Fully emulate preemption timer")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>


* Re: [PATCH 04/13] KVM: x86: Make return for {interrupt,nmi}_allowed() a bool instead of int
From: Jim Mattson @ 2020-04-28 21:42 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Return an actual bool for kvm_x86_ops' {interrupt,nmi}_allowed() hooks to
> better reflect the return semantics, and to avoid creating an even
> bigger mess when the related VMX code is refactored in upcoming patches.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>


* Re: [PATCH 05/13] KVM: nVMX: Move nested_exit_on_nmi() to nested.h
  2020-04-23  2:25 ` [PATCH 05/13] KVM: nVMX: Move nested_exit_on_nmi() to nested.h Sean Christopherson
@ 2020-04-28 21:44   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 21:44 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Expose nested_exit_on_nmi() for use by vmx_nmi_allowed() in a future
> patch.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
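
The helper being moved is a one-liner; roughly (assuming the vmcs12
accessors already used in nested.h):

        static inline bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
        {
                /* True if L1 wants NMIs to trigger VM-Exit from L2. */
                return nested_cpu_has_nmi_exiting(get_vmcs12(vcpu));
        }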

* Re: [PATCH 06/13] KVM: nVMX: Report NMIs as allowed when in L2 and Exit-on-NMI is set
  2020-04-23  2:25 ` [PATCH 06/13] KVM: nVMX: Report NMIs as allowed when in L2 and Exit-on-NMI is set Sean Christopherson
@ 2020-04-28 21:46   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 21:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Report NMIs as allowed when the vCPU is in L2 and L2 is being run with
> Exit-on-NMI enabled, as NMIs are always unblocked from L1's perspective
> in this case.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
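
Combined with the refactor later in the series, the resulting check looks
roughly like the following (a sketch; vmx_nmi_blocked() is the helper
split out by the next patch):

        static bool vmx_nmi_allowed(struct kvm_vcpu *vcpu)
        {
                if (to_vmx(vcpu)->nested.nested_run_pending)
                        return false;

                /*
                 * NMIs are always unblocked from L1's perspective when
                 * L2 is running with Exit-on-NMI enabled.
                 */
                if (is_guest_mode(vcpu) && nested_exit_on_nmi(vcpu))
                        return true;

                return !vmx_nmi_blocked(vcpu);
        }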

* Re: [PATCH 07/13] KVM: VMX: Split out architectural interrupt/NMI blocking checks
  2020-04-23  2:25 ` [PATCH 07/13] KVM: VMX: Split out architectural interrupt/NMI blocking checks Sean Christopherson
@ 2020-04-28 21:57   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 21:57 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Move the architectural (non-KVM specific) interrupt/NMI blocking checks
> to a separate helper so that they can be used in a future patch by
> vmx_check_nested_events().
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
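
A sketch of the split-out interrupt check as it ends up looking (the NMI
side is analogous; exact details are assumed from the series description):

        bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu)
        {
                if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
                        return false;

                /*
                 * Architectural blocking: RFLAGS.IF clear, or blocking
                 * by STI / MOV SS in the interruptibility state.
                 */
                return !(vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) ||
                       (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
                        (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
        }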

* Re: [PATCH 08/13] KVM: nVMX: Preserve IRQ/NMI priority irrespective of exiting behavior
  2020-04-23  2:25 ` [PATCH 08/13] KVM: nVMX: Preserve IRQ/NMI priority irrespective of exiting behavior Sean Christopherson
@ 2020-04-28 21:58   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 21:58 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Short circuit vmx_check_nested_events() if an unblocked IRQ/NMI is
> pending and needs to be injected into L2, as priority between coincident
> events does not depend on exiting behavior.
>
> Fixes: b6b8a1451fc4 ("KVM: nVMX: Rework interception of IRQs and NMIs")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
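
A sketch of the short circuit in vmx_check_nested_events(), using the
helpers split out by the previous patch (the exact placement within the
function is assumed):

        if (vcpu->arch.nmi_pending && !vmx_nmi_blocked(vcpu)) {
                if (block_nested_events)
                        return -EBUSY;
                if (!nested_exit_on_nmi(vcpu))
                        goto no_vmexit; /* inject into L2, no VM-Exit */

                nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
                                  NMI_VECTOR | INTR_TYPE_NMI_INTR |
                                  INTR_INFO_VALID_MASK, 0);
                return 0;
        }

        if (kvm_cpu_has_interrupt(vcpu) && !vmx_interrupt_blocked(vcpu)) {
                if (block_nested_events)
                        return -EBUSY;
                if (!nested_exit_on_intr(vcpu))
                        goto no_vmexit;

                nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
                return 0;
        }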

* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-23  2:25 ` [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI Sean Christopherson
@ 2020-04-28 22:04   ` Jim Mattson
  2020-04-28 22:59     ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 22:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Check for an unblocked SMI in vmx_check_nested_events() so that pending
> SMIs are correctly prioritized over IRQs and NMIs when the latter events
> will trigger VM-Exit.  This also fixes an issue where an SMI that was
> marked pending while processing a nested VM-Enter wouldn't trigger an
> immediate exit, i.e. would be incorrectly delayed until L2 happened to
> take a VM-Exit.
>
> Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 1fdaca5fd93d..8c16b190816b 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
>                 return 0;
>         }
>
> +       if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
> +               if (block_nested_events)
> +                       return -EBUSY;
> +               goto no_vmexit;
> +       }
> +

From the SDM, volume 3:

• System-management interrupts (SMIs), INIT signals, and higher
priority events take priority over MTF VM exits.

I think this block needs to be moved up.

* Re: [PATCH 10/13] KVM: x86: WARN on injected+pending exception even in nested case
  2020-04-23  2:25 ` [PATCH 10/13] KVM: x86: WARN on injected+pending exception even in nested case Sean Christopherson
@ 2020-04-28 22:05   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 22:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> WARN if a pending exception is coincident with an injected exception
> before calling check_nested_events() so that the WARN will fire even if
> inject_pending_event() bails early because check_nested_events() detects
> the conflict.  Bailing early isn't problematic (quite the opposite), but
> suppressing the WARN is undesirable as it could mask a bug elsewhere in
> KVM.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
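
The WARN in question, hoisted so that it fires before any early return (a
sketch; the condition follows directly from the commit message):

        /*
         * In inject_pending_event(), before check_nested_events(): an
         * exception must never be simultaneously injected and pending,
         * as that would indicate a bug elsewhere in KVM.
         */
        WARN_ON_ONCE(vcpu->arch.exception.injected &&
                     vcpu->arch.exception.pending);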

* Re: [PATCH 11/13] KVM: VMX: Use vmx_interrupt_blocked() directly from vmx_handle_exit()
  2020-04-23  2:25 ` [PATCH 11/13] KVM: VMX: Use vmx_interrupt_blocked() directly from vmx_handle_exit() Sean Christopherson
@ 2020-04-28 22:07   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 22:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Use vmx_interrupt_blocked() instead of bouncing through
> vmx_interrupt_allowed() when handling edge cases in vmx_handle_exit().
> The nested_run_pending check in vmx_interrupt_allowed() should never
> evaluate true in the VM-Exit path.
>
> Hoist the WARN in handle_invalid_guest_state() up to vmx_handle_exit()
> to enforce the above assumption for the !enable_vnmi case, and to detect
> any other potential bugs with nested VM-Enter.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
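
The essence of the substitution, shown on the !enable_vnmi edge case (a
sketch; field names are assumed):

        /*
         * nested_run_pending can't be set on the VM-Exit path, so query
         * architectural blocking directly instead of bouncing through
         * vmx_interrupt_allowed().
         */
        if (!vmx_interrupt_blocked(vcpu))
                vmx->loaded_vmcs->soft_vnmi_blocked = 0;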

* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-23  2:25 ` [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix Sean Christopherson
  2020-04-23 11:00   ` Paolo Bonzini
@ 2020-04-28 22:12   ` Jim Mattson
  2020-04-28 22:20     ` Sean Christopherson
  1 sibling, 1 reply; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 22:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Add a separate hook for checking if interrupt injection is blocked and
> use the hook to handle the case where an interrupt arrives between
> check_nested_events() and the injection logic.  Drop the retry of
> check_nested_events() that hack-a-fixed the same condition.
>
> Blocking injection is also a bit of a hack, e.g. KVM should do exiting
> and non-exiting interrupt processing in a single pass, but it's a more
> precise hack.  The old comment is also misleading, e.g. KVM_REQ_EVENT is
> purely an optimization; setting it on every run loop (which KVM doesn't
> do) should not affect functionality, only performance.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/svm/svm.c          |  1 +
>  arch/x86/kvm/vmx/vmx.c          | 13 +++++++++++++
>  arch/x86/kvm/x86.c              | 22 ++++------------------
>  4 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 787636acd648..16fdeddb4a65 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1140,6 +1140,7 @@ struct kvm_x86_ops {
>         void (*queue_exception)(struct kvm_vcpu *vcpu);
>         void (*cancel_injection)(struct kvm_vcpu *vcpu);
>         bool (*interrupt_allowed)(struct kvm_vcpu *vcpu);
> +       bool (*interrupt_injection_allowed)(struct kvm_vcpu *vcpu);
>         bool (*nmi_allowed)(struct kvm_vcpu *vcpu);
>         bool (*get_nmi_mask)(struct kvm_vcpu *vcpu);
>         void (*set_nmi_mask)(struct kvm_vcpu *vcpu, bool masked);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index f21f734861dd..6d3ccbfc9e6a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3993,6 +3993,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>         .queue_exception = svm_queue_exception,
>         .cancel_injection = svm_cancel_injection,
>         .interrupt_allowed = svm_interrupt_allowed,
> +       .interrupt_injection_allowed = svm_interrupt_allowed,
>         .nmi_allowed = svm_nmi_allowed,
>         .get_nmi_mask = svm_get_nmi_mask,
>         .set_nmi_mask = svm_set_nmi_mask,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2f8cacb3aa9b..68b3748b5383 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4550,6 +4550,18 @@ static bool vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>         return !vmx_interrupt_blocked(vcpu);
>  }
>
> +static bool vmx_interrupt_injection_allowed(struct kvm_vcpu *vcpu)
> +{
> +       /*
> +        * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
> +        * e.g. if the IRQ arrived asynchronously after checking nested events.
> +        */
> +       if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
> +               return false;
> +
> +       return vmx_interrupt_allowed(vcpu);
> +}
> +
>  static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
>  {
>         int ret;
> @@ -7823,6 +7835,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>         .queue_exception = vmx_queue_exception,
>         .cancel_injection = vmx_cancel_injection,
>         .interrupt_allowed = vmx_interrupt_allowed,
> +       .interrupt_injection_allowed = vmx_interrupt_injection_allowed,
>         .nmi_allowed = vmx_nmi_allowed,
>         .get_nmi_mask = vmx_get_nmi_mask,
>         .set_nmi_mask = vmx_set_nmi_mask,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7c49a7dc601f..d9d6028a77e0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7755,24 +7755,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
>                 --vcpu->arch.nmi_pending;
>                 vcpu->arch.nmi_injected = true;
>                 kvm_x86_ops.set_nmi(vcpu);
> -       } else if (kvm_cpu_has_injectable_intr(vcpu)) {
> -               /*
> -                * Because interrupts can be injected asynchronously, we are
> -                * calling check_nested_events again here to avoid a race condition.
> -                * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
> -                * proposal and current concerns.  Perhaps we should be setting
> -                * KVM_REQ_EVENT only on certain events and not unconditionally?
> -                */
> -               if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
> -                       r = kvm_x86_ops.check_nested_events(vcpu);
> -                       if (r != 0)
> -                               return r;
> -               }
> -               if (kvm_x86_ops.interrupt_allowed(vcpu)) {
> -                       kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
> -                                           false);
> -                       kvm_x86_ops.set_irq(vcpu);
> -               }
> +       } else if (kvm_cpu_has_injectable_intr(vcpu) &&
> +                  kvm_x86_ops.interrupt_injection_allowed(vcpu)) {
> +               kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
> +               kvm_x86_ops.set_irq(vcpu);
>         }
So, that's what this mess was all about! Well, this certainly looks better.

Reviewed-by: Jim Mattson <jmattson@google.com>

* Re: [PATCH 13/13] KVM: VMX: Use vmx_get_rflags() to query RFLAGS in vmx_interrupt_blocked()
  2020-04-23  2:25 ` [PATCH 13/13] KVM: VMX: Use vmx_get_rflags() to query RFLAGS in vmx_interrupt_blocked() Sean Christopherson
@ 2020-04-28 22:13   ` Jim Mattson
  0 siblings, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 22:13 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Use vmx_get_rflags() instead of manually reading vmcs.GUEST_RFLAGS when
> querying RFLAGS.IF so that multiple checks against interrupt blocking in
> a single run loop only require a single VMREAD.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
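
The resulting check in vmx_interrupt_blocked(), roughly (only the RFLAGS
read changes; vmx_get_rflags() goes through KVM's register cache, so
repeated checks in a single run loop need only one VMREAD):

        return !(vmx_get_rflags(vcpu) & X86_EFLAGS_IF) ||
               (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
                (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));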

* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-28 22:12   ` Jim Mattson
@ 2020-04-28 22:20     ` Sean Christopherson
  2020-04-29  8:36       ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2020-04-28 22:20 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Tue, Apr 28, 2020 at 03:12:51PM -0700, Jim Mattson wrote:
> On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 7c49a7dc601f..d9d6028a77e0 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7755,24 +7755,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
> >                 --vcpu->arch.nmi_pending;
> >                 vcpu->arch.nmi_injected = true;
> >                 kvm_x86_ops.set_nmi(vcpu);
> > -       } else if (kvm_cpu_has_injectable_intr(vcpu)) {
> > -               /*
> > -                * Because interrupts can be injected asynchronously, we are
> > -                * calling check_nested_events again here to avoid a race condition.
> > -                * See https://lkml.org/lkml/2014/7/2/60 for discussion about this
> > -                * proposal and current concerns.  Perhaps we should be setting
> > -                * KVM_REQ_EVENT only on certain events and not unconditionally?
> > -                */
> > -               if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
> > -                       r = kvm_x86_ops.check_nested_events(vcpu);
> > -                       if (r != 0)
> > -                               return r;
> > -               }
> > -               if (kvm_x86_ops.interrupt_allowed(vcpu)) {
> > -                       kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
> > -                                           false);
> > -                       kvm_x86_ops.set_irq(vcpu);
> > -               }
> > +       } else if (kvm_cpu_has_injectable_intr(vcpu) &&
> > +                  kvm_x86_ops.interrupt_injection_allowed(vcpu)) {
> > +               kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
> > +               kvm_x86_ops.set_irq(vcpu);
> >         }
> So, that's what this mess was all about! Well, this certainly looks better.

Right?  I can't count the number of times I've looked at this code and
wondered what the hell it was doing.

Side topic, I just realized you're reviewing my original series.  Paolo
commandeered it to extend it to SVM. https://patchwork.kernel.org/cover/11508679/


* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-28 22:04   ` Jim Mattson
@ 2020-04-28 22:59     ` Sean Christopherson
  2020-04-28 23:16       ` Jim Mattson
  2020-04-28 23:23       ` Jim Mattson
  0 siblings, 2 replies; 39+ messages in thread
From: Sean Christopherson @ 2020-04-28 22:59 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote:
> On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Check for an unblocked SMI in vmx_check_nested_events() so that pending
> > SMIs are correctly prioritized over IRQs and NMIs when the latter events
> > will trigger VM-Exit.  This also fixes an issue where an SMI that was
> > marked pending while processing a nested VM-Enter wouldn't trigger an
> > immediate exit, i.e. would be incorrectly delayed until L2 happened to
> > take a VM-Exit.
> >
> > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support")
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 1fdaca5fd93d..8c16b190816b 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> >                 return 0;
> >         }
> >
> > +       if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
> > +               if (block_nested_events)
> > +                       return -EBUSY;
> > +               goto no_vmexit;
> > +       }
> > +
> 
> From the SDM, volume 3:
> 
> • System-management interrupts (SMIs), INIT signals, and higher
> priority events take priority over MTF VM exits.
> 
> I think this block needs to be moved up.

Hrm.  It definitely needs to be moved above the preemption timer, though I
can't find any public documentation about the preemption timer's priority.
Preemption timer is lower priority than MTF, ergo it's not in the same
class as SMI.

Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF
and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM.  I
think it makes sense to take the easy road and keep SMI after the traps,
with a comment to say it's technically wrong but not worth fixing.

* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-28 22:59     ` Sean Christopherson
@ 2020-04-28 23:16       ` Jim Mattson
  2020-04-29 14:50         ` Sean Christopherson
  2020-04-28 23:23       ` Jim Mattson
  1 sibling, 1 reply; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 23:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote:
> > On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > Check for an unblocked SMI in vmx_check_nested_events() so that pending
> > > SMIs are correctly prioritized over IRQs and NMIs when the latter events
> > > will trigger VM-Exit.  This also fixes an issue where an SMI that was
> > > marked pending while processing a nested VM-Enter wouldn't trigger an
> > > immediate exit, i.e. would be incorrectly delayed until L2 happened to
> > > take a VM-Exit.
> > >
> > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support")
> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > > ---
> > >  arch/x86/kvm/vmx/nested.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index 1fdaca5fd93d..8c16b190816b 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> > >                 return 0;
> > >         }
> > >
> > > +       if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
> > > +               if (block_nested_events)
> > > +                       return -EBUSY;
> > > +               goto no_vmexit;
> > > +       }
> > > +
> >
> > From the SDM, volume 3:
> >
> > • System-management interrupts (SMIs), INIT signals, and higher
> > priority events take priority over MTF VM exits.
> >
> > I think this block needs to be moved up.
>
> Hrm.  It definitely needs to be moved above the preemption timer, though I
> can't find any public documentation about the preemption timer's priority.
> Preemption timer is lower priority than MTF, ergo it's not in the same
> class as SMI.
>
> Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF
> and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM.  I
> think it makes sense to take the easy road and keep SMI after the traps,
> with a comment to say it's technically wrong but not worth fixing.

Pending debug exceptions should just go in the pending debug
exceptions field. End of story and end of complications. I don't
understand why kvm is so averse to using this field the way it was
intended.

As for the MTF, section 34.14.1 of the SDM, volume 3, clearly states:

The pseudocode above makes reference to the saving of VMX-critical
state. This state consists of the following: (1) SS.DPL (the current
privilege level); (2) RFLAGS.VM; (3) the state of blocking by STI and
by MOV SS (see Table 24-3 in Section 24.4.2); (4) the state of
virtual-NMI blocking (only if the processor is in VMX non-root
operation and the “virtual NMIs” VM-execution control is 1); and (5)
an indication of whether an MTF VM exit is pending (see Section
25.5.2). These data may be saved internal to the processor or in the
VMCS region of the current VMCS. Processors that do not support SMI
recognition while there is blocking by STI or by MOV SS need not save
the state of such blocking.

I haven't really looked at kvm's implementation of SMM (because Google
doesn't support it), but it seems that the "MTF VM exit is pending"
bit should be trivial to deal with. I assume we save the other
VMX-critical state somewhere!

* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-28 22:59     ` Sean Christopherson
  2020-04-28 23:16       ` Jim Mattson
@ 2020-04-28 23:23       ` Jim Mattson
  1 sibling, 0 replies; 39+ messages in thread
From: Jim Mattson @ 2020-04-28 23:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote:
> > On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > Check for an unblocked SMI in vmx_check_nested_events() so that pending
> > > SMIs are correctly prioritized over IRQs and NMIs when the latter events
> > > will trigger VM-Exit.  This also fixes an issue where an SMI that was
> > > marked pending while processing a nested VM-Enter wouldn't trigger an
> > > immediate exit, i.e. would be incorrectly delayed until L2 happened to
> > > take a VM-Exit.
> > >
> > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support")
> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > > ---
> > >  arch/x86/kvm/vmx/nested.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index 1fdaca5fd93d..8c16b190816b 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> > >                 return 0;
> > >         }
> > >
> > > +       if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
> > > +               if (block_nested_events)
> > > +                       return -EBUSY;
> > > +               goto no_vmexit;
> > > +       }
> > > +
> >
> > From the SDM, volume 3:
> >
> > • System-management interrupts (SMIs), INIT signals, and higher
> > priority events take priority over MTF VM exits.
> >
> > I think this block needs to be moved up.
>
> Hrm.  It definitely needs to be moved above the preemption timer, though I
> can't find any public documentation about the preemption timer's priority.

Section 25.2 of the SDM, volume 3:

Debug-trap exceptions and higher priority events take priority over
VM exits caused by the VMX-preemption timer. VM exits caused by the
VMX-preemption timer take priority over VM exits caused by the
“NMI-window exiting” VM-execution control and lower priority events.

* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-28 22:20     ` Sean Christopherson
@ 2020-04-29  8:36       ` Paolo Bonzini
  2020-04-29 16:45         ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2020-04-29  8:36 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm list, LKML,
	Oliver Upton, Peter Shier

On 29/04/20 00:20, Sean Christopherson wrote:
>> So, that's what this mess was all about! Well, this certainly looks better.
> Right?  I can't count the number of times I've looked at this code and
> wondered what the hell it was doing.
> 
> Side topic, I just realized you're reviewing my original series.  Paolo
> commandeered it to extend it to SVM. https://patchwork.kernel.org/cover/11508679/

If you can just send a patch to squash into 9/13 I can take care of it.

Paolo


* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-28 23:16       ` Jim Mattson
@ 2020-04-29 14:50         ` Sean Christopherson
  2020-04-29 20:06           ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2020-04-29 14:50 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Tue, Apr 28, 2020 at 04:16:16PM -0700, Jim Mattson wrote:
> On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote:
> > > From the SDM, volume 3:
> > >
> > > • System-management interrupts (SMIs), INIT signals, and higher
> > > priority events take priority over MTF VM exits.
> > >
> > > I think this block needs to be moved up.
> >
> > Hrm.  It definitely needs to be moved above the preemption timer, though I
> > can't find any public documentation about the preemption timer's priority.
> > Preemption timer is lower priority than MTF, ergo it's not in the same
> > class as SMI.
> >
> > Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF
> > and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM.  I
> > think it makes sense to take the easy road and keep SMI after the traps,
> > with a comment to say it's technically wrong but not worth fixing.
> 
> Pending debug exceptions should just go in the pending debug
> exceptions field. End of story and end of complications. I don't
> understand why kvm is so averse to using this field the way it was
> intended.

Ah, it took my brain a bit to catch on.  I assume you're suggesting calling
nested_vmx_update_pending_dbg() so that the pending #DB naturally gets
propagated to/from vmcs12 on SMI/RSM?  I think that should work.
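
For reference, the helper in question is roughly the following (a sketch
of its shape in nested.c; the exact condition may differ):

        static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu)
        {
                /*
                 * Propagate a pending #DB into the VMCS field so that
                 * it survives a VM-Exit from L2 (and, per the idea
                 * above, an SMI/RSM round trip).
                 */
                if (vcpu->arch.exception.nr == DB_VECTOR)
                        vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
                                    vcpu->arch.exception.payload);
        }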

> As for the MTF, section 34.14.1 of the SDM, volume 3, clearly states:
> 
> The pseudocode above makes reference to the saving of VMX-critical
> state. This state consists of the following: (1) SS.DPL (the current
> privilege level); (2) RFLAGS.VM; (3) the state of blocking by STI and
> by MOV SS (see Table 24-3 in Section 24.4.2); (4) the state of
> virtual-NMI blocking (only if the processor is in VMX non-root
> operation and the “virtual NMIs” VM-execution control is 1); and (5)
> an indication of whether an MTF VM exit is pending (see Section
> 25.5.2). These data may be saved internal to the processor or in the
> VMCS region of the current VMCS. Processors that do not support SMI
> recognition while there is blocking by STI or by MOV SS need not save
> the state of such blocking.
> 
> I haven't really looked at kvm's implementation of SMM (because Google
> doesn't support it), but it seems that the "MTF VM exit is pending"
> bit should be trivial to deal with. I assume we save the other
> VMX-critical state somewhere!

True, I spaced on the existence of vmx_pre_{enter,leave}_smm().

I'll send a patch, the delta to what's in kvm/queue should actually be
quite small.

* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-29  8:36       ` Paolo Bonzini
@ 2020-04-29 16:45         ` Sean Christopherson
  2020-04-29 16:58           ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2020-04-29 16:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jim Mattson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 29, 2020 at 10:36:17AM +0200, Paolo Bonzini wrote:
> On 29/04/20 00:20, Sean Christopherson wrote:
> >> So, that's what this mess was all about! Well, this certainly looks better.
> > Right?  I can't count the number of times I've looked at this code and
> > wondered what the hell it was doing.
> > 
> > Side topic, I just realized you're reviewing my original series.  Paolo
> > commandeered it to extend it to SVM. https://patchwork.kernel.org/cover/11508679/
> 
> If you can just send a patch to squash into 9/13 I can take care of it.

Ugh, correctly prioritizing SMI is a mess.  It has migration implications,
a proper fix requires non-trivial changes to inject_pending_event(), there
are pre-existing (minor) bugs related to MTF handling, and technically INIT
should have lower priority than non-trap exceptions (because the exception
happens before the event window is opened).

Can you just drop 9/13, "Prioritize SMI over nested IRQ/NMI" from kvm/queue?
It's probably best to deal with this in a new series rather than trying to
squeeze it in.

* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-29 16:45         ` Sean Christopherson
@ 2020-04-29 16:58           ` Paolo Bonzini
  2020-04-29 17:07             ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2020-04-29 16:58 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Jim Mattson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On 29/04/20 18:45, Sean Christopherson wrote:
> 
> Can you just drop 9/13, "Prioritize SMI over nested IRQ/NMI" from kvm/queue?
> It's probably best to deal with this in a new series rather than trying to
> squeeze it in.

With AMD we just have IRQ/NMI/SMI, and it's important to handle SMI in
check_nested_events because you can turn SMIs into a vmexit without stuff
such as dual-monitor treatment.  On the other hand there is no MTF and
we're not handling exceptions yet.  So, since SMIs should be pretty rare
anyway, I'd rather just add a comment detailing the correct order and
why we're not following it.  The minimal fix would be to move SMI above
the preemption timer, right?

Paolo


* Re: [PATCH 12/13] KVM: x86: Replace late check_nested_events() hack with more precise fix
  2020-04-29 16:58           ` Paolo Bonzini
@ 2020-04-29 17:07             ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2020-04-29 17:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jim Mattson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 29, 2020 at 06:58:45PM +0200, Paolo Bonzini wrote:
> On 29/04/20 18:45, Sean Christopherson wrote:
> > 
> > Can you just drop 9/13, "Prioritize SMI over nested IRQ/NMI" from kvm/queue?
> > It's probably best to deal with this in a new series rather than trying to
> > squeeze it in.
> 
> With AMD we just have IRQ/NMI/SMI, and it's important to handle SMI in

Ah, forgot about that angle.

> check_nested_events because you can turn SMIs into vmexit without stuff
> such as dual-monitor treatment.  On the other hand there is no MTF and
> we're not handling exceptions yet.  So, since SMIs should be pretty rare
> anyway, I'd rather just add a comment detailing the correct order and
> why we're not following it.  The minimal fix would be to move SMI above
> the preemption timer, right?

Yep, that works for now.

I'd still like to do a full fix for SMI and INIT.  Correctness aside, I
think/hope the changes I have in mind will make it easier to connect the
dots between KVM's event priority and the SDM's event priority.  But that
can definitely wait for 5.9.
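
Concretely, the minimal fix amounts to checking SMI ahead of the
preemption timer in vmx_check_nested_events(), i.e. roughly (helper names
as in the posted series):

        /* SMI has architectural priority over the preemption timer. */
        if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
                if (block_nested_events)
                        return -EBUSY;
                goto no_vmexit;
        }

        if (nested_vmx_preemption_timer_pending(vcpu)) {
                if (block_nested_events)
                        return -EBUSY;
                nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
                return 0;
        }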

* Re: [PATCH 09/13] KVM: nVMX: Prioritize SMI over nested IRQ/NMI
  2020-04-29 14:50         ` Sean Christopherson
@ 2020-04-29 20:06           ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2020-04-29 20:06 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Oliver Upton, Peter Shier

On Wed, Apr 29, 2020 at 07:50:40AM -0700, Sean Christopherson wrote:
> On Tue, Apr 28, 2020 at 04:16:16PM -0700, Jim Mattson wrote:
> > On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote:
> > > > From the SDM, volume 3:
> > > >
> > > > • System-management interrupts (SMIs), INIT signals, and higher
> > > > priority events take priority over MTF VM exits.
> > > >
> > > > I think this block needs to be moved up.
> > >
> > > Hrm.  It definitely needs to be moved above the preemption timer, though I
> > > can't find any public documentation about the preemption timer's priority.
> > > Preemption timer is lower priority than MTF, ergo it's not in the same
> > > class as SMI.
> > >
> > > Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF
> > > and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM.  I
> > > think it makes sense to take the easy road and keep SMI after the traps,
> > > with a comment to say it's technically wrong but not worth fixing.
> > 
> > Pending debug exceptions should just go in the pending debug
> > exceptions field. End of story and end of complications. I don't
> > understand why kvm is so averse to using this field the way it was
> > intended.
> 
> Ah, it took my brain a bit to catch on.  I assume you're suggesting calling
> nested_vmx_update_pending_dbg() so that the pending #DB naturally gets
> propagated to/from vmcs12 on SMI/RSM?  I think that should work.

This works for L2, but not L1 :-(  And L2 can't be fixed without first
fixing L1 because inject_pending_event() also incorrectly prioritizes #DB
over SMI.  For L1, utilizing SMRAM to save/restore the pending #DB is
likely the easiest solution as it avoids having to add new state for
migration.

I have everything coded up but it'll probably take a few weeks to test and
get it sent out, need to focus on other stuff for a while.
