kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lai Jiangshan <jiangshanlai@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Lai Jiangshan <laijs@linux.alibaba.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andi Kleen <ak@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, Josh Poimboeuf <jpoimboe@redhat.com>,
	Uros Bizjak <ubizjak@gmail.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 3/4] KVM/VMX: Invoke NMI non-IST entry instead of IST entry
Date: Tue, 27 Apr 2021 07:09:48 +0800	[thread overview]
Message-ID: <20210426230949.3561-4-jiangshanlai@gmail.com> (raw)
In-Reply-To: <20210426230949.3561-1-jiangshanlai@gmail.com>

From: Lai Jiangshan <laijs@linux.alibaba.com>

In VMX, the NMI handler needs to be invoked after NMI VM-Exit.

Before the commit 1a5488ef0dcf6 ("KVM: VMX: Invoke NMI handler via
indirect call instead of INTn"), the work is done by INTn ("int $2").

But INTn microcode is relatively expensive, so the commit reworked
NMI VM-Exit handling to invoke the kernel handler by function call.
And INTn doesn't set the NMI blocked flag required by the linux kernel
NMI entry.  So moving away from INTn are very reasonable.

Yet some details were missed.  After the said commit applied, the NMI
entry pointer is fetched from the IDT table and called from the kernel
stack.  But the NMI entry pointer installed on the IDT table is
asm_exc_nmi() which expects to be invoked on the IST stack by the ISA.
And it relies on the "NMI executing" variable on the IST stack to work
correctly.  When it is unexpectedly called from the kernel stack, the
RSP-located "NMI executing" variable is also on the kernel stack and
is "uninitialized" and can cause the NMI entry to run in the wrong way.

So we should not used the NMI entry installed on the IDT table.  Rather,
we should use the NMI entry allowed to be used on the kernel stack which
is asm_noist_exc_nmi() which is also used for XENPV and early booting.

Link: https://lore.kernel.org/lkml/20200915191505.10355-3-sean.j.christopherson@intel.com/
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: kvm@vger.kernel.org
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
 arch/x86/kernel/nmi.c  | 3 +++
 arch/x86/kvm/vmx/vmx.c | 8 ++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 2fb1fd59d714..919f0400d931 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -528,10 +528,13 @@ DEFINE_IDTENTRY_RAW(noist_exc_nmi)
 	 * On Xen PV and early booting stage, NMI doesn't use IST.
+	 * And when it is manually called from VMX NMI VM-Exit handler,
+	 * it doesn't use IST either.
 	 * The C part is the same as native.
 void stop_nmi(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bcbf0d2139e9..96e59d912637 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -36,6 +36,7 @@
 #include <asm/debugreg.h>
 #include <asm/desc.h>
 #include <asm/fpu/internal.h>
+#include <asm/idtentry.h>
 #include <asm/io.h>
 #include <asm/irq_remapping.h>
 #include <asm/kexec.h>
@@ -6416,8 +6417,11 @@ static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx)
 	else if (is_machine_check(intr_info))
 	/* We need to handle NMIs before interrupts are enabled */
-	else if (is_nmi(intr_info))
-		handle_interrupt_nmi_irqoff(&vmx->vcpu, intr_info);
+	else if (is_nmi(intr_info)) {
+		kvm_before_interrupt(&vmx->vcpu);
+		vmx_do_interrupt_nmi_irqoff((unsigned long)asm_noist_exc_nmi);
+		kvm_after_interrupt(&vmx->vcpu);
+	}
 static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)

  parent reply	other threads:[~2021-04-27  8:55 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-26 23:09 [PATCH 0/4] x86: Don't invoke asm_exc_nmi() on the kernel stack Lai Jiangshan
2021-04-26 23:09 ` [PATCH 1/4] x86/xen/entry: Rename xenpv_exc_nmi to noist_exc_nmi Lai Jiangshan
2021-04-28 21:27   ` Steven Rostedt
2021-04-30  7:15     ` Paolo Bonzini
2021-04-30 12:05       ` Steven Rostedt
2021-05-03 19:05   ` Thomas Gleixner
2021-05-03 19:41     ` Thomas Gleixner
2021-05-10  7:59   ` Juergen Gross
2021-04-26 23:09 ` Lai Jiangshan [this message]
2021-04-30  2:46   ` [PATCH 3/4] KVM/VMX: Invoke NMI non-IST entry instead of IST entry Lai Jiangshan
2021-05-03 19:37   ` Thomas Gleixner
2021-05-03 20:02   ` Thomas Gleixner
2021-05-04  8:10     ` Paolo Bonzini
2021-04-26 23:09 ` [PATCH 4/4] KVM/VMX: Fold handle_interrupt_nmi_irqoff() into its solo caller Lai Jiangshan
2021-04-30  9:03   ` Thomas Gleixner
2021-04-30  9:06     ` Paolo Bonzini
2021-04-30 23:28       ` Thomas Gleixner
2021-04-30  7:14 ` [PATCH 0/4] x86: Don't invoke asm_exc_nmi() on the kernel stack Paolo Bonzini
2021-05-03 14:36   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210426230949.3561-4-jiangshanlai@gmail.com \
    --to=jiangshanlai@gmail.com \
    --cc=ak@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=jpoimboe@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=laijs@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=ubizjak@gmail.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).