* [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup
@ 2019-04-20  5:50 Sean Christopherson
  2019-04-20  5:50 ` [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry Sean Christopherson
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

This series' primary focus is to refine VMX's handling of INTRs, NMIs
and #MCs after a VM-Exit, making a few small optimizations and hopefully
resulting in more readable code.

There's also a bug fix related to handling #MCs that occur during VM-Entry
that was found by inspection when doing the aforementioned cleanup.

Sean Christopherson (5):
  KVM: VMX: Fix handling of #MC that occurs during VM-Entry
  KVM: VMX: Read cached VM-Exit reason to detect external interrupt
  KVM: VMX: Store the host kernel's IDT base in a global variable
  KVM: x86: Move kvm_{before,after}_interrupt() calls to vendor code
  KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn

 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/svm.c              |   6 +-
 arch/x86/kvm/vmx/vmcs.h         |   6 ++
 arch/x86/kvm/vmx/vmx.c          | 111 +++++++++++++++++---------------
 arch/x86/kvm/vmx/vmx.h          |   1 -
 arch/x86/kvm/x86.c              |   4 +-
 6 files changed, 71 insertions(+), 59 deletions(-)

-- 
2.21.0



* [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry
  2019-04-20  5:50 [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup Sean Christopherson
@ 2019-04-20  5:50 ` Sean Christopherson
  2019-06-06 12:57   ` Paolo Bonzini
  2019-04-20  5:50 ` [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt Sean Christopherson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

A previous fix to prevent KVM from consuming stale VMCS state after a
failed VM-Entry inadvertently blocked KVM's handling of machine checks
that occur during VM-Entry.

Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways,
depending on when the #MC is recognized.  As it pertains to this bug
fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY
is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to
indicate the VM-Entry failed.

If a machine-check event occurs during a VM entry, one of the following occurs:
 - The machine-check event is handled as if it occurred before the VM entry:
        ...
 - The machine-check event is handled after VM entry completes:
        ...
 - A VM-entry failure occurs as described in Section 26.7. The basic
   exit reason is 41, for "VM-entry failure due to machine-check event".
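
For reference, a rough sketch of how the exit reason encoding works;
the field layout is per the SDM, but the snippet itself is purely
illustrative and not taken from the patch:

	u32 exit_reason = vmcs_read32(VM_EXIT_REASON);
	u16 basic_exit_reason = (u16)exit_reason;	/* bits 15:0 */

	/* Bit 31 flags a failed VM-Entry; basic reason 41 is the #MC case. */
	if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) &&
	    basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
		kvm_machine_check();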

Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in
vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit().
Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY
in a sane fashion and also simplifies vmx_complete_atomic_exit() since
VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh.

Fixes: b060ca3b2e9e7 ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly")
Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d8f101b58ab8..79ce9c7062f9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6103,28 +6103,21 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 
 static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 {
-	u32 exit_intr_info = 0;
-	u16 basic_exit_reason = (u16)vmx->exit_reason;
-
-	if (!(basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY
-	      || basic_exit_reason == EXIT_REASON_EXCEPTION_NMI))
+	if (vmx->exit_reason != EXIT_REASON_EXCEPTION_NMI)
 		return;
 
-	if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
-		exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-	vmx->exit_intr_info = exit_intr_info;
+	vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 
 	/* if exit due to PF check for async PF */
-	if (is_page_fault(exit_intr_info))
+	if (is_page_fault(vmx->exit_intr_info))
 		vmx->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
 
 	/* Handle machine checks before interrupts are enabled */
-	if (basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY ||
-	    is_machine_check(exit_intr_info))
+	if (is_machine_check(vmx->exit_intr_info))
 		kvm_machine_check();
 
 	/* We need to handle NMIs before interrupts are enabled */
-	if (is_nmi(exit_intr_info)) {
+	if (is_nmi(vmx->exit_intr_info)) {
 		kvm_before_interrupt(&vmx->vcpu);
 		asm("int $2");
 		kvm_after_interrupt(&vmx->vcpu);
@@ -6527,6 +6520,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	vmx->idt_vectoring_info = 0;
 
 	vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON);
+	if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
+		kvm_machine_check();
+
 	if (vmx->fail || (vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
 		return;
 
-- 
2.21.0



* [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt
  2019-04-20  5:50 [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup Sean Christopherson
  2019-04-20  5:50 ` [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry Sean Christopherson
@ 2019-04-20  5:50 ` Sean Christopherson
  2019-06-06 13:02   ` Paolo Bonzini
  2019-04-20  5:50 ` [PATCH 3/5] KVM: VMX: Store the host kernel's IDT base in a global variable Sean Christopherson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

Generic x86 code blindly invokes the dedicated external interrupt
handler blindly, i.e. vmx_handle_external_intr() is called on all
VM-Exits regardless of the actual exit type.  Use the already-cached
EXIT_REASON to determine if the VM-Exit was due to an interrupt, thus
avoiding an extra VMREAD (to query VM_EXIT_INTR_INFO) for all other
types of VM-Exit.

In addition to avoiding the extra VMREAD, checking the EXIT_REASON
instead of VM_EXIT_INTR_INFO makes it more obvious that
vmx_handle_external_intr() is called for all VM-Exits, e.g. someone
unfamiliar with the flow might wonder under what condition(s)
VM_EXIT_INTR_INFO does not contain a valid interrupt, which is
simply not possible since KVM always runs with "ack interrupt on exit".
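
As an aside, "ack interrupt on exit" refers to the
VM_EXIT_ACK_INTR_ON_EXIT VM-Exit control: when set, the CPU
acknowledges the interrupt controller on exit and stores the vector in
VM_EXIT_INTR_INFO, which is why the field is always valid for
EXTERNAL_INTERRUPT exits.  A simplified sketch of enabling the control
(KVM actually sets it as part of its VM-Exit controls setup, not
literally like this):

	vmcs_write32(VM_EXIT_CONTROLS,
		     vmcs_read32(VM_EXIT_CONTROLS) | VM_EXIT_ACK_INTR_ON_EXIT);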

WARN once if VM_EXIT_INTR_INFO doesn't contain a valid interrupt on
an EXTERNAL_INTERRUPT VM-Exit, as such a condition would indicate a
hardware bug.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmcs.h |  6 +++++
 arch/x86/kvm/vmx/vmx.c  | 60 +++++++++++++++++++++--------------------
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index cb6079f8a227..971a46c69df4 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -115,6 +115,12 @@ static inline bool is_nmi(u32 intr_info)
 		== (INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK);
 }
 
+static inline bool is_external_intr(u32 intr_info)
+{
+	return (intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
+		== (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR);
+}
+
 enum vmcs_field_width {
 	VMCS_FIELD_WIDTH_U16 = 0,
 	VMCS_FIELD_WIDTH_U64 = 1,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 79ce9c7062f9..58e83fc86ad6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6126,42 +6126,44 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 
 static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 {
-	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-
-	if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
-			== (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) {
-		unsigned int vector;
-		unsigned long entry;
-		gate_desc *desc;
-		struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned int vector;
+	unsigned long entry;
 #ifdef CONFIG_X86_64
-		unsigned long tmp;
+	unsigned long tmp;
 #endif
+	u32 intr_info;
+
+	if (to_vmx(vcpu)->exit_reason != EXIT_REASON_EXTERNAL_INTERRUPT)
+		return;
+
+	intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+	if (WARN_ONCE(!is_external_intr(intr_info),
+	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
+		return;
+
+	vector = intr_info & INTR_INFO_VECTOR_MASK;
+	entry = gate_offset((gate_desc *)to_vmx(vcpu)->host_idt_base + vector);
 
-		vector =  exit_intr_info & INTR_INFO_VECTOR_MASK;
-		desc = (gate_desc *)vmx->host_idt_base + vector;
-		entry = gate_offset(desc);
-		asm volatile(
+	asm volatile(
 #ifdef CONFIG_X86_64
-			"mov %%" _ASM_SP ", %[sp]\n\t"
-			"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
-			"push $%c[ss]\n\t"
-			"push %[sp]\n\t"
+		"mov %%" _ASM_SP ", %[sp]\n\t"
+		"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
+		"push $%c[ss]\n\t"
+		"push %[sp]\n\t"
 #endif
-			"pushf\n\t"
-			__ASM_SIZE(push) " $%c[cs]\n\t"
-			CALL_NOSPEC
-			:
+		"pushf\n\t"
+		__ASM_SIZE(push) " $%c[cs]\n\t"
+		CALL_NOSPEC
+		:
 #ifdef CONFIG_X86_64
-			[sp]"=&r"(tmp),
+		[sp]"=&r"(tmp),
 #endif
-			ASM_CALL_CONSTRAINT
-			:
-			THUNK_TARGET(entry),
-			[ss]"i"(__KERNEL_DS),
-			[cs]"i"(__KERNEL_CS)
-			);
-	}
+		ASM_CALL_CONSTRAINT
+		:
+		THUNK_TARGET(entry),
+		[ss]"i"(__KERNEL_DS),
+		[cs]"i"(__KERNEL_CS)
+	);
 }
 STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);
 
-- 
2.21.0



* [PATCH 3/5] KVM: VMX: Store the host kernel's IDT base in a global variable
  2019-04-20  5:50 [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup Sean Christopherson
  2019-04-20  5:50 ` [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry Sean Christopherson
  2019-04-20  5:50 ` [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt Sean Christopherson
@ 2019-04-20  5:50 ` Sean Christopherson
  2019-04-20 14:17   ` [RFC PATCH] KVM: VMX: host_idt_base can be static kbuild test robot
  2019-04-20  5:50 ` [PATCH 4/5] KVM: x86: Move kvm_{before,after}_interrupt() calls to vendor code Sean Christopherson
  2019-04-20  5:50 ` [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn Sean Christopherson
  4 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

Although the kernel may use multiple IDTs, KVM should only ever see the
"real" IDT, e.g. the early init IDT is long gone by the time KVM runs
and the debug stack IDT is only used for small windows of time in very
specific flows.

Before commit a547c6db4d2f1 ("KVM: VMX: Enable acknowledge interupt on
vmexit"), the kernel's IDT base was consumed by KVM only when setting
constant VMCS state, i.e. to set VMCS.HOST_IDTR_BASE.  Because constant
host state is set only once per vCPU, there was ostensibly no need to cache
the kernel's IDT base.

When support for "ack interrupt on exit" was introduced, KVM added a
second consumer of the IDT base as handling already-acked interrupts
requires directly calling the interrupt handler, i.e. KVM uses the IDT
base to find the address of the handler.  Because interrupts are a fast
path, KVM cached the IDT base to avoid having to VMREAD HOST_IDTR_BASE.
Presumably, the IDT base was cached on a per-vCPU basis simply because
the existing code grabbed the IDT base on a per-vCPU (VMCS) basis.

Note, all post-boot IDTs use the same handlers for external interrupts,
i.e. the "ack interrupt on exit" use of the IDT base would be unaffected
even if the cached IDT somehow did not match the current IDT.  And as
for the original use case of setting VMCS.HOST_IDTR_BASE, if any of the
above analysis is wrong then KVM has had a bug since the beginning of
time, as KVM has effectively been caching the IDT at vCPU creation
since commit a8b732ca01c ("[PATCH] kvm: userspace interface").
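
For illustration, the IDT lookup described above boils down to the
following sketch; gate_desc and gate_offset() are the kernel's real
descriptor helpers, but the snippet itself is a simplification rather
than patch code:

	gate_desc *desc = (gate_desc *)host_idt_base + vector;
	unsigned long entry = gate_offset(desc);	/* handler address */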

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
 arch/x86/kvm/vmx/vmx.h |  1 -
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 58e83fc86ad6..897f360a4cfa 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -389,6 +389,7 @@ static const struct kvm_vmx_segment_field {
 };
 
 u64 host_efer;
+unsigned long host_idt_base;
 
 /*
  * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
@@ -3732,7 +3733,6 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 {
 	u32 low32, high32;
 	unsigned long tmpl;
-	struct desc_ptr dt;
 	unsigned long cr0, cr3, cr4;
 
 	cr0 = read_cr0();
@@ -3768,9 +3768,7 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 	vmcs_write16(HOST_SS_SELECTOR, __KERNEL_DS);  /* 22.2.4 */
 	vmcs_write16(HOST_TR_SELECTOR, GDT_ENTRY_TSS*8);  /* 22.2.4 */
 
-	store_idt(&dt);
-	vmcs_writel(HOST_IDTR_BASE, dt.address);   /* 22.2.4 */
-	vmx->host_idt_base = dt.address;
+	vmcs_writel(HOST_IDTR_BASE, host_idt_base);   /* 22.2.4 */
 
 	vmcs_writel(HOST_RIP, (unsigned long)vmx_vmexit); /* 22.2.5 */
 
@@ -6142,7 +6140,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 		return;
 
 	vector = intr_info & INTR_INFO_VECTOR_MASK;
-	entry = gate_offset((gate_desc *)to_vmx(vcpu)->host_idt_base + vector);
+	entry = gate_offset((gate_desc *)host_idt_base + vector);
 
 	asm volatile(
 #ifdef CONFIG_X86_64
@@ -7443,10 +7441,14 @@ static bool vmx_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 static __init int hardware_setup(void)
 {
 	unsigned long host_bndcfgs;
+	struct desc_ptr dt;
 	int r, i;
 
 	rdmsrl_safe(MSR_EFER, &host_efer);
 
+	store_idt(&dt);
+	host_idt_base = dt.address;
+
 	for (i = 0; i < ARRAY_SIZE(vmx_msr_index); ++i)
 		kvm_define_shared_msr(i, vmx_msr_index[i]);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 1e42f983e0f1..d66a0f453469 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -184,7 +184,6 @@ struct vcpu_vmx {
 	int                   nmsrs;
 	int                   save_nmsrs;
 	bool                  guest_msrs_dirty;
-	unsigned long	      host_idt_base;
 #ifdef CONFIG_X86_64
 	u64		      msr_host_kernel_gs_base;
 	u64		      msr_guest_kernel_gs_base;
-- 
2.21.0



* [PATCH 4/5] KVM: x86: Move kvm_{before,after}_interrupt() calls to vendor code
  2019-04-20  5:50 [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup Sean Christopherson
                   ` (2 preceding siblings ...)
  2019-04-20  5:50 ` [PATCH 3/5] KVM: VMX: Store the host kernel's IDT base in a global variable Sean Christopherson
@ 2019-04-20  5:50 ` Sean Christopherson
  2019-04-20  5:50 ` [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn Sean Christopherson
  4 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

VMX can conditionally call kvm_{before,after}_interrupt() since KVM
always uses "ack interrupt on exit" and therefore explicitly handles
interrupts as opposed to blindly enabling irqs.
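
For context, at the time of this series kvm_{before,after}_interrupt()
are trivial helpers that publish the running vCPU so the perf NMI
handler can attribute the interrupt to the guest; a sketch from memory
of their definitions in x86.h, which may differ slightly:

	static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu)
	{
		__this_cpu_write(current_vcpu, vcpu);
	}

	static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu)
	{
		__this_cpu_write(current_vcpu, NULL);
	}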

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm.c     | 2 ++
 arch/x86/kvm/vmx/vmx.c | 4 ++++
 arch/x86/kvm/x86.c     | 2 --
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 406b558abfef..38e1c7d382a1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6162,6 +6162,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 
 static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
 {
+	kvm_before_interrupt(vcpu);
 	local_irq_enable();
 	/*
 	 * We must have an instruction with interrupts enabled, so
@@ -6169,6 +6170,7 @@ static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
 	 */
 	asm("nop");
 	local_irq_disable();
+	kvm_after_interrupt(vcpu);
 }
 
 static void svm_sched_in(struct kvm_vcpu *vcpu, int cpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 897f360a4cfa..1fbd5a5dd6af 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6142,6 +6142,8 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 	vector = intr_info & INTR_INFO_VECTOR_MASK;
 	entry = gate_offset((gate_desc *)host_idt_base + vector);
 
+	kvm_before_interrupt(vcpu);
+
 	asm volatile(
 #ifdef CONFIG_X86_64
 		"mov %%" _ASM_SP ", %[sp]\n\t"
@@ -6162,6 +6164,8 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 		[ss]"i"(__KERNEL_DS),
 		[cs]"i"(__KERNEL_CS)
 	);
+
+	kvm_after_interrupt(vcpu);
 }
 STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c09507057743..7aa002b12f25 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7945,9 +7945,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
 
-	kvm_before_interrupt(vcpu);
 	kvm_x86_ops->handle_external_intr(vcpu);
-	kvm_after_interrupt(vcpu);
 
 	++vcpu->stat.exits;
 
-- 
2.21.0



* [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn
  2019-04-20  5:50 [PATCH 0/5] KVM: VMX: INTR, NMI and #MC cleanup Sean Christopherson
                   ` (3 preceding siblings ...)
  2019-04-20  5:50 ` [PATCH 4/5] KVM: x86: Move kvm_{before,after}_interrupt() calls to vendor code Sean Christopherson
@ 2019-04-20  5:50 ` Sean Christopherson
  2019-06-06 13:20   ` Paolo Bonzini
  4 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2019-04-20  5:50 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, Joerg Roedel; +Cc: kvm, Jim Mattson

Per commit 1b6269db3f833 ("KVM: VMX: Handle NMIs before enabling
interrupts and preemption"), NMIs are handled directly in vmx_vcpu_run()
to "make sure we handle NMI on the current cpu, and that we don't
service maskable interrupts before non-maskable ones".  The other
exceptions handled by complete_atomic_exit(), e.g. async #PF and #MC,
have similar requirements, and are located there to avoid extra VMREADs
since VMX bins hardware exceptions and NMIs into a single exit reason.

Clean up the code and eliminate the vaguely named complete_atomic_exit()
by moving the interrupts-disabled exception and NMI handling into the
existing handle_external_intr() callback, and rename the callback to
a more appropriate name.

In addition to improving code readability, this also ensures the NMI
handler is run with the host's debug registers loaded in the unlikely
event that the user is debugging NMIs.  Accuracy of the last_guest_tsc
field is also improved when handling NMIs (and #MCs) as the handler
will run after updating said field.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c              |  4 ++--
 arch/x86/kvm/vmx/vmx.c          | 25 ++++++++++++++-----------
 arch/x86/kvm/x86.c              |  2 +-
 4 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8d68ba0cba0c..cd60c3ae7f66 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1109,7 +1109,7 @@ struct kvm_x86_ops {
 	int (*check_intercept)(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
-	void (*handle_external_intr)(struct kvm_vcpu *vcpu);
+	void (*handle_events_irqs_disabled)(struct kvm_vcpu *vcpu);
 	bool (*mpx_supported)(void);
 	bool (*xsaves_supported)(void);
 	bool (*umip_emulated)(void);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 38e1c7d382a1..e117058eba87 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6160,7 +6160,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 	return ret;
 }
 
-static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
+static void svm_handle_events_irqs_disabled(struct kvm_vcpu *vcpu)
 {
 	kvm_before_interrupt(vcpu);
 	local_irq_enable();
@@ -7256,7 +7256,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.set_tdp_cr3 = set_tdp_cr3,
 
 	.check_intercept = svm_check_intercept,
-	.handle_external_intr = svm_handle_external_intr,
+	.handle_events_irqs_disabled = svm_handle_events_irqs_disabled,
 
 	.request_immediate_exit = __kvm_request_immediate_exit,
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1fbd5a5dd6af..9b580749217f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4441,7 +4441,7 @@ static void kvm_machine_check(void)
 
 static int handle_machine_check(struct kvm_vcpu *vcpu)
 {
-	/* already handled by vcpu_run */
+	/* handled by vmx_handle_events_irqs_disabled() */
 	return 1;
 }
 
@@ -4461,7 +4461,7 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 		return handle_machine_check(vcpu);
 
 	if (is_nmi(intr_info))
-		return 1;  /* already handled by vmx_vcpu_run() */
+		return 1; /* handled by vmx_handle_events_irqs_disabled() */
 
 	if (is_invalid_opcode(intr_info))
 		return handle_ud(vcpu);
@@ -6099,11 +6099,8 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 	memset(vmx->pi_desc.pir, 0, sizeof(vmx->pi_desc.pir));
 }
 
-static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
+static void vmx_handle_exception_nmi_irqs_disabled(struct vcpu_vmx *vmx)
 {
-	if (vmx->exit_reason != EXIT_REASON_EXCEPTION_NMI)
-		return;
-
 	vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 
 	/* if exit due to PF check for async PF */
@@ -6131,9 +6128,6 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 #endif
 	u32 intr_info;
 
-	if (to_vmx(vcpu)->exit_reason != EXIT_REASON_EXTERNAL_INTERRUPT)
-		return;
-
 	intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 	if (WARN_ONCE(!is_external_intr(intr_info),
 	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
@@ -6169,6 +6163,16 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 }
 STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);
 
+static void vmx_handle_events_irqs_disabled(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
+		vmx_handle_exception_nmi_irqs_disabled(vmx);
+	else if (vmx->exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+		vmx_handle_external_intr(vcpu);
+}
+
 static bool vmx_has_emulated_msr(int index)
 {
 	switch (index) {
@@ -6533,7 +6537,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	vmx->loaded_vmcs->launched = 1;
 	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	vmx_complete_atomic_exit(vmx);
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
 }
@@ -7708,7 +7711,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.set_tdp_cr3 = vmx_set_cr3,
 
 	.check_intercept = vmx_check_intercept,
-	.handle_external_intr = vmx_handle_external_intr,
+	.handle_events_irqs_disabled = vmx_handle_events_irqs_disabled,
 	.mpx_supported = vmx_mpx_supported,
 	.xsaves_supported = vmx_xsaves_supported,
 	.umip_emulated = vmx_umip_emulated,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7aa002b12f25..82d320f42b1d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7945,7 +7945,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
 
-	kvm_x86_ops->handle_external_intr(vcpu);
+	kvm_x86_ops->handle_events_irqs_disabled(vcpu);
 
 	++vcpu->stat.exits;
 
-- 
2.21.0



* [RFC PATCH] KVM: VMX: host_idt_base can be static
  2019-04-20  5:50 ` [PATCH 3/5] KVM: VMX: Store the host kernel's IDT base in a global variable Sean Christopherson
@ 2019-04-20 14:17   ` kbuild test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2019-04-20 14:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kbuild-all, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, kvm, Jim Mattson


Fixes: 94952ddb99f7 ("KVM: VMX: Store the host kernel's IDT base in a global variable")
Signed-off-by: kbuild test robot <lkp@intel.com>
---
 vmx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7061df8a..429f373 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -389,7 +389,7 @@ static const struct kvm_vmx_segment_field {
 };
 
 u64 host_efer;
-unsigned long host_idt_base;
+static unsigned long host_idt_base;
 
 /*
  * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm


* Re: [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry
  2019-04-20  5:50 ` [PATCH 1/5] KVM: VMX: Fix handling of #MC that occurs during VM-Entry Sean Christopherson
@ 2019-06-06 12:57   ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2019-06-06 12:57 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář, Joerg Roedel
  Cc: kvm, Jim Mattson

On 20/04/19 07:50, Sean Christopherson wrote:
> A previous fix to prevent KVM from consuming stale VMCS state after a
> failed VM-Entry inadvertently blocked KVM's handling of machine checks
> that occur during VM-Entry.
> 
> Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways,
> depending on when the #MC is recognized.  As it pertains to this bug
> fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY
> is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to
> indicate the VM-Entry failed.
> 
> If a machine-check event occurs during a VM entry, one of the following occurs:
>  - The machine-check event is handled as if it occurred before the VM entry:
>         ...
>  - The machine-check event is handled after VM entry completes:
>         ...
>  - A VM-entry failure occurs as described in Section 26.7. The basic
>    exit reason is 41, for "VM-entry failure due to machine-check event".
> 
> Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in
> vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit().
> Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY
> in a sane fashion and also simplifies vmx_complete_atomic_exit() since
> VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh.
> 
> Fixes: b060ca3b2e9e7 ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 20 ++++++++------------
>  1 file changed, 8 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d8f101b58ab8..79ce9c7062f9 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6103,28 +6103,21 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
>  
>  static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
>  {
> -	u32 exit_intr_info = 0;
> -	u16 basic_exit_reason = (u16)vmx->exit_reason;
> -
> -	if (!(basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY
> -	      || basic_exit_reason == EXIT_REASON_EXCEPTION_NMI))
> +	if (vmx->exit_reason != EXIT_REASON_EXCEPTION_NMI)
>  		return;
>  
> -	if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
> -		exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> -	vmx->exit_intr_info = exit_intr_info;
> +	vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
>  
>  	/* if exit due to PF check for async PF */
> -	if (is_page_fault(exit_intr_info))
> +	if (is_page_fault(vmx->exit_intr_info))
>  		vmx->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
>  
>  	/* Handle machine checks before interrupts are enabled */
> -	if (basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY ||
> -	    is_machine_check(exit_intr_info))
> +	if (is_machine_check(vmx->exit_intr_info))
>  		kvm_machine_check();
>  
>  	/* We need to handle NMIs before interrupts are enabled */
> -	if (is_nmi(exit_intr_info)) {
> +	if (is_nmi(vmx->exit_intr_info)) {
>  		kvm_before_interrupt(&vmx->vcpu);
>  		asm("int $2");
>  		kvm_after_interrupt(&vmx->vcpu);

This is indeed cleaner in addition to fixing the bug.  I'm also applying this

-------------- 8< --------------
Subject: [PATCH] kvm: nVMX: small cleanup in handle_exception
From: Paolo Bonzini <pbonzini@redhat.com>

The reason for skipping handling of NMI and #MC in handle_exception is
the same, namely they are handled earlier by vmx_complete_atomic_exit.
Calling the machine check handler (which just returns 1) is misleading,
don't do it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1b3ca0582a0c..da6c829bad9f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4455,11 +4455,8 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 	vect_info = vmx->idt_vectoring_info;
 	intr_info = vmx->exit_intr_info;
 
-	if (is_machine_check(intr_info))
-		return handle_machine_check(vcpu);
-
-	if (is_nmi(intr_info))
-		return 1;  /* already handled by vmx_vcpu_run() */
+	if (is_machine_check(intr_info) || is_nmi(intr_info))
+		return 1;  /* already handled by vmx_complete_atomic_exit */
 
 	if (is_invalid_opcode(intr_info))
 		return handle_ud(vcpu);



* Re: [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt
  2019-04-20  5:50 ` [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt Sean Christopherson
@ 2019-06-06 13:02   ` Paolo Bonzini
  2019-06-06 14:09     ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2019-06-06 13:02 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář, Joerg Roedel
  Cc: kvm, Jim Mattson

On 20/04/19 07:50, Sean Christopherson wrote:
> Generic x86 code blindly invokes the dedicated external interrupt
> handler blindly, i.e. vmx_handle_external_intr() is called on all
> VM-Exits regardless of the actual exit type.

That's *really* blindly. :)  Rephrased to

    Generic x86 code invokes the kvm_x86_ops external interrupt handler
    on all VM-Exits regardless of the actual exit type.

-		unsigned long entry;
-		gate_desc *desc;
+	unsigned long entry;

I'd rather keep the desc variable to simplify review (with "diff -b")
and because the code is more readable that way.  Unless you have a
strong reason not to do so, I can do the change when applying.

Paolo


* Re: [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn
  2019-04-20  5:50 ` [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn Sean Christopherson
@ 2019-06-06 13:20   ` Paolo Bonzini
  2019-06-06 15:14     ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2019-06-06 13:20 UTC (permalink / raw)
  To: Sean Christopherson, Radim Krčmář, Joerg Roedel
  Cc: kvm, Jim Mattson

On 20/04/19 07:50, Sean Christopherson wrote:
> Per commit 1b6269db3f833 ("KVM: VMX: Handle NMIs before enabling
> interrupts and preemption"), NMIs are handled directly in vmx_vcpu_run()
> to "make sure we handle NMI on the current cpu, and that we don't
> service maskable interrupts before non-maskable ones".  The other
> exceptions handled by complete_atomic_exit(), e.g. async #PF and #MC,
> have similar requirements, and are located there to avoid extra VMREADs
> since VMX bins hardware exceptions and NMIs into a single exit reason.
> 
> Clean up the code and eliminate the vaguely named complete_atomic_exit()
> by moving the interrupts-disabled exception and NMI handling into the
> existing handle_external_intr() callback, and rename the callback to
> a more appropriate name.
> 
> In addition to improving code readability, this also ensures the NMI
> handler is run with the host's debug registers loaded in the unlikely
> event that the user is debugging NMIs.  Accuracy of the last_guest_tsc
> field is also improved when handling NMIs (and #MCs) as the handler
> will run after updating said field.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

Very nice, just some changes I'd like to propose. "atomic" is Linux 
lingo for "irqs disabled", so I'd like to rename the handler to 
handle_exit_atomic so it has a correspondence with handle_exit.  
Likewise we could have handle_exception_nmi_atomic and 
handle_external_interrupt_atomic.

Putting everything together we get:

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 35e7937cc9ac..b7d5935c1637 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1117,7 +1117,7 @@ struct kvm_x86_ops {
 	int (*check_intercept)(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
-	void (*handle_external_intr)(struct kvm_vcpu *vcpu);
+	void (*handle_exit_atomic)(struct kvm_vcpu *vcpu);
 	bool (*mpx_supported)(void);
 	bool (*xsaves_supported)(void);
 	bool (*umip_emulated)(void);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index acc09e9fc173..9c6458e60558 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6172,7 +6172,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 	return ret;
 }
 
-static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
+static void svm_handle_exit_atomic(struct kvm_vcpu *vcpu)
 {
 	kvm_before_interrupt(vcpu);
 	local_irq_enable();
@@ -7268,7 +7268,7 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	.set_tdp_cr3 = set_tdp_cr3,
 
 	.check_intercept = svm_check_intercept,
-	.handle_external_intr = svm_handle_external_intr,
+	.handle_exit_atomic = svm_handle_exit_atomic,
 
 	.request_immediate_exit = __kvm_request_immediate_exit,
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 963c8c409223..dfaa770b9bb3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4437,11 +4437,11 @@ static void kvm_machine_check(void)
 
 static int handle_machine_check(struct kvm_vcpu *vcpu)
 {
-	/* already handled by vcpu_run */
+	/* handled by vmx_vcpu_run() */
 	return 1;
 }
 
-static int handle_exception(struct kvm_vcpu *vcpu)
+static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct kvm_run *kvm_run = vcpu->run;
@@ -4454,7 +4454,7 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 	intr_info = vmx->exit_intr_info;
 
 	if (is_machine_check(intr_info) || is_nmi(intr_info))
-		return 1;  /* already handled by vmx_complete_atomic_exit */
+		return 1; /* handled by handle_exception_nmi_atomic() */
 
 	if (is_invalid_opcode(intr_info))
 		return handle_ud(vcpu);
@@ -5462,7 +5462,7 @@ static int handle_encls(struct kvm_vcpu *vcpu)
  * to be done to userspace and return 0.
  */
 static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
-	[EXIT_REASON_EXCEPTION_NMI]           = handle_exception,
+	[EXIT_REASON_EXCEPTION_NMI]           = handle_exception_nmi,
 	[EXIT_REASON_EXTERNAL_INTERRUPT]      = handle_external_interrupt,
 	[EXIT_REASON_TRIPLE_FAULT]            = handle_triple_fault,
 	[EXIT_REASON_NMI_WINDOW]	      = handle_nmi_window,
@@ -6100,11 +6100,8 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 	memset(vmx->pi_desc.pir, 0, sizeof(vmx->pi_desc.pir));
 }
 
-static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
+static void handle_exception_nmi_atomic(struct vcpu_vmx *vmx)
 {
-	if (vmx->exit_reason != EXIT_REASON_EXCEPTION_NMI)
-		return;
-
 	vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 
 	/* if exit due to PF check for async PF */
@@ -6123,7 +6120,7 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 	}
 }
 
-static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
+static void handle_external_interrupt_atomic(struct kvm_vcpu *vcpu)
 {
 	unsigned int vector;
 	unsigned long entry;
@@ -6133,9 +6130,6 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 	gate_desc *desc;
 	u32 intr_info;
 
-	if (to_vmx(vcpu)->exit_reason != EXIT_REASON_EXTERNAL_INTERRUPT)
-		return;
-
 	intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 	if (WARN_ONCE(!is_external_intr(intr_info),
 	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
@@ -6170,7 +6164,17 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 
 	kvm_after_interrupt(vcpu);
 }
-STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);
+STACK_FRAME_NON_STANDARD(handle_external_interrupt_atomic);
+
+static void vmx_handle_exit_atomic(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (vmx->exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+		handle_external_interrupt_atomic(vcpu);
+	else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
+		handle_exception_nmi_atomic(vmx);
+}
 
 static bool vmx_has_emulated_msr(int index)
 {
@@ -6540,7 +6544,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	vmx->loaded_vmcs->launched = 1;
 	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	vmx_complete_atomic_exit(vmx);
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
 }
@@ -7694,7 +7697,7 @@ static __exit void hardware_unsetup(void)
 	.set_tdp_cr3 = vmx_set_cr3,
 
 	.check_intercept = vmx_check_intercept,
-	.handle_external_intr = vmx_handle_external_intr,
+	.handle_exit_atomic = vmx_handle_exit_atomic,
 	.mpx_supported = vmx_mpx_supported,
 	.xsaves_supported = vmx_xsaves_supported,
 	.umip_emulated = vmx_umip_emulated,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6e2f53cd8ea8..88489af13e96 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7999,7 +7999,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
 
-	kvm_x86_ops->handle_external_intr(vcpu);
+	kvm_x86_ops->handle_exit_atomic(vcpu);
 
 	++vcpu->stat.exits;
 



* Re: [PATCH 2/5] KVM: VMX: Read cached VM-Exit reason to detect external interrupt
  2019-06-06 13:02   ` Paolo Bonzini
@ 2019-06-06 14:09     ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2019-06-06 14:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Radim Krčmář, Joerg Roedel, kvm, Jim Mattson

On Thu, Jun 06, 2019 at 03:02:02PM +0200, Paolo Bonzini wrote:
> On 20/04/19 07:50, Sean Christopherson wrote:
> > Generic x86 code blindly invokes the dedicated external interrupt
> > handler blindly, i.e. vmx_handle_external_intr() is called on all
> > VM-Exits regardless of the actual exit type.
> 
> That's *really* blindly. :)  Rephrased to

Hmm, I must not have seen the first one.

>     Generic x86 code invokes the kvm_x86_ops external interrupt handler
>     on all VM-Exits regardless of the actual exit type.
> 
> -		unsigned long entry;
> -		gate_desc *desc;
> +	unsigned long entry;
> 
> I'd rather keep the desc variable to simplify review (with "diff -b")
> and because the code is more readable that way.  Unless you have a
> strong reason not to do so, I can do the change when applying.

No strong reason, I found the code to be more readable without it :-)


* Re: [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn
  2019-06-06 13:20   ` Paolo Bonzini
@ 2019-06-06 15:14     ` Sean Christopherson
  2019-06-07 11:40       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2019-06-06 15:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Radim Krčmář, Joerg Roedel, kvm, Jim Mattson

On Thu, Jun 06, 2019 at 03:20:49PM +0200, Paolo Bonzini wrote:
> On 20/04/19 07:50, Sean Christopherson wrote:
> > Per commit 1b6269db3f833 ("KVM: VMX: Handle NMIs before enabling
> > interrupts and preemption"), NMIs are handled directly in vmx_vcpu_run()
> > to "make sure we handle NMI on the current cpu, and that we don't
> > service maskable interrupts before non-maskable ones".  The other
> > exceptions handled by complete_atomic_exit(), e.g. async #PF and #MC,
> > have similar requirements, and are located there to avoid extra VMREADs
> > since VMX bins hardware exceptions and NMIs into a single exit reason.
> > 
> > Clean up the code and eliminate the vaguely named complete_atomic_exit()
> > by moving the interrupts-disabled exception and NMI handling into the
> > existing handle_external_intr() callback, and rename the callback to
> > a more appropriate name.
> > 
> > In addition to improving code readability, this also ensures the NMI
> > handler is run with the host's debug registers loaded in the unlikely
> > event that the user is debugging NMIs.  Accuracy of the last_guest_tsc
> > field is also improved when handling NMIs (and #MCs) as the handler
> > will run after updating said field.
> > 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Very nice, just some changes I'd like to propose. "atomic" is Linux 
> lingo for "irqs disabled", so I'd like to rename the handler to 

The code disagrees, e.g.

  /*
   * Are we running in atomic context?  WARNING: this macro cannot
   * always detect atomic context; in particular, it cannot know about
   * held spinlocks in non-preemptible kernels.  Thus it should not be
   * used in the general case to determine whether sleeping is possible.
   * Do not use in_atomic() in driver code.
   */
  #define in_atomic()	(preempt_count() != 0)

and

  void ___might_sleep(...)
  {
	...

	printk(KERN_ERR
		"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
			in_atomic(), irqs_disabled(),
			current->pid, current->comm);
  }

and

  static inline void *kmap_atomic(struct page *page)
  {
	preempt_disable();
	pagefault_disable();
	return page_address(page);
  }

My interpretation of things is that the kernel's definition of an atomic
context is with respect to preemption.  Disabling IRQs would also provide
atomicity, but the reverse is not true, i.e. entering an atomic context
does not imply IRQs are disabled.
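
Concretely, a toy example (assuming a preemptible kernel; not actual
kernel code): taking a spinlock enters atomic context while leaving
IRQs enabled:

	static DEFINE_SPINLOCK(lock);

	spin_lock(&lock);		/* in_atomic() == true from here... */
	WARN_ON(irqs_disabled());	/* ...yet doesn't fire, IRQs still on */
	spin_unlock(&lock);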

As it pertains to KVM, we specifically care about IRQs being disabled,
e.g. VMX needs to ensure #MC and NMI are handled before any pending IRQs,
and both VMX and SVM need to ensure a pending perf interrupt is handled
in the callback.

And if "atomic" is interpreted as "IRQs disabled", one could argue that
the SVM behavior is buggy since enabling IRQs would break atomicity.

> handle_exit_atomic so it has a correspondence with handle_exit.  
> Likewise we could have handle_exception_nmi_atomic and 
> handle_external_interrupt_atomic.


* Re: [PATCH 5/5] KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn
  2019-06-06 15:14     ` Sean Christopherson
@ 2019-06-07 11:40       ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2019-06-07 11:40 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Radim Krčmář, Joerg Roedel, kvm, Jim Mattson

On 06/06/19 17:14, Sean Christopherson wrote:
> The code disagrees, e.g.
> 
>   /*
>    * Are we running in atomic context?  WARNING: this macro cannot
>    * always detect atomic context; in particular, it cannot know about
>    * held spinlocks in non-preemptible kernels.  Thus it should not be
>    * used in the general case to determine whether sleeping is possible.
>    * Do not use in_atomic() in driver code.
>    */
>   #define in_atomic()	(preempt_count() != 0)

You're totally right.  "_irqoff" seems to be the common suffix for
irq-disabled functions.

Paolo

