* [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug
@ 2021-06-09 18:56 Sean Christopherson
  2021-06-09 18:56 ` [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Sean Christopherson
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Fix a NULL pointer dereference in gfn_to_rmap() that occurs if RSM fails,
reported by syzbot.

The immediate problem is that the MMU context's role gets out of sync
because KVM clears the SMM flag in the vCPU at the start of RSM emulation,
but only resets the MMU context if RSM succeeds.  The divergence in vCPU
vs. MMU role with respect to the SMM flag causes explosions if the non-SMM
memslots have gfn ranges that are not present in the SMM memslots, because
the MMU expects that the memslot for a shadow page cannot magically
disappear.

The other obvious problem is that KVM doesn't emulate triple fault on RSM
failure, e.g. it keeps running the vCPU in a frankenstate instead of
exiting to userspace.  Fixing that would squash the syzbot repro, but
would not fix the underlying issue because nothing prevents userspace from
calling KVM_RUN on a vCPU that hit shutdown (yay lack of a shutdown state).
But, it's easy to fix and definitely worth doing.
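
For reference, emulating the triple fault means queueing
KVM_REQ_TRIPLE_FAULT; vcpu_enter_guest() already services that request by
exiting to userspace with KVM_EXIT_SHUTDOWN (paraphrased from the current
handling, illustrative rather than a diff):

  if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) {
          vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
          r = 0;
          goto out;
  }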

Everything after the two bug fixes is cleanup.

Ben Gardon has an internal patch or two that guards against the NULL
pointer dereference in gfn_to_rmap().  I'm planning on getting that
functionality posted (needs a little massaging) so that these types of
snafus don't crash the host (this isn't the first time I've introduced a
bug that broke gfn_to_rmap(), though thankfully it's the first time such
a bug has made it upstream, knock on wood).

Amusingly, adding gfn_to_rmap() NULL memslot checks might even be a
performance improvement.  Because gfn_to_rmap() doesn't check the memslot
before using it, and because the compiler can see that search_memslots()
returns NULL/0, gcc often/always generates dedicated (and hilarious) code
for NULL, e.g. this #GP was caused by an explicit load from 0:

  48 8b 14 25 00 00 00 00	mov    0x0,%rdx
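
For the curious, a contrived userspace sketch (not KVM code) of the
pattern: when the compiler can prove a pointer is NULL on some path, it
may emit a literal load from address 0 for that path instead of a branch:

  struct slot { unsigned long rmap; };

  static struct slot *find_slot(int id)
  {
          return NULL;    /* stand-in for a memslot search that misses */
  }

  unsigned long lookup(int id)
  {
          /* no NULL check, mirroring gfn_to_rmap() */
          return find_slot(id)->rmap;
  }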


Sean Christopherson (9):
  KVM: x86: Immediately reset the MMU context when the SMM flag is
    cleared
  KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  KVM: x86: Replace .set_hflags() with dedicated .exiting_smm() helper
  KVM: x86: Invoke kvm_smm_changed() immediately after clearing SMM flag
  KVM: x86: Move (most) SMM hflags modifications into kvm_smm_changed()
  KVM: x86: Move "entering SMM" tracepoint into kvm_smm_changed()
  KVM: x86: Rename SMM tracepoint to make it reflect reality
  KVM: x86: Drop .post_leave_smm(), i.e. the manual post-RSM MMU reset
  KVM: x86: Drop "pre_" from enter/leave_smm() helpers

 arch/x86/include/asm/kvm-x86-ops.h |  4 +--
 arch/x86/include/asm/kvm_host.h    |  4 +--
 arch/x86/kvm/emulate.c             | 31 ++++++++++-------
 arch/x86/kvm/kvm_emulate.h         |  7 ++--
 arch/x86/kvm/svm/svm.c             |  8 ++---
 arch/x86/kvm/trace.h               |  2 +-
 arch/x86/kvm/vmx/vmx.c             |  8 ++---
 arch/x86/kvm/x86.c                 | 53 +++++++++++++++---------------
 8 files changed, 61 insertions(+), 56 deletions(-)

-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-10 13:19   ` Paolo Bonzini
  2021-06-09 18:56 ` [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails Sean Christopherson
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Immediately reset the MMU context when the vCPU's SMM flag is cleared so
that the SMM flag in the MMU role is always synchronized with the vCPU's
flag.  If RSM fails (which isn't correctly emulated), KVM will bail
without calling post_leave_smm() and leave the MMU in a bad state.

The bad MMU role can lead to a NULL pointer dereference when grabbing a
shadow page's rmap for a page fault as the initial lookups for the gfn
will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will
use the shadow page's SMM flag, which comes from the MMU (=1).  SMM has
an entirely different set of memslots, and so the initial lookup can find
a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1).
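
For context, the rmap lookup in question (paraphrasing mmu.c as of this
series; see gfn_to_rmap()) derives the memslot from the shadow page's
role and never checks for a missing slot:

  static struct kvm_rmap_head *gfn_to_rmap(struct kvm *kvm, gfn_t gfn,
                                           struct kvm_mmu_page *sp)
  {
          struct kvm_memslots *slots;
          struct kvm_memory_slot *slot;

          slots = kvm_memslots_for_spte_role(kvm, sp->role);
          /* NULL if the vCPU and shadow page SMM roles diverged */
          slot = __gfn_to_memslot(slots, gfn);
          return __gfn_to_rmap(gfn, sp->role.level, slot);
  }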

  general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline]
  RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947
  Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44
  RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
  RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002
  R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000
  R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000
  FS:  000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline]
   mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604
   __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline]
   direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769
   kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline]
   kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065
   vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122
   vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428
   vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494
   kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722
   kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:1069 [inline]
   __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055
   do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x440ce9

Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com
Fixes: 9ec19493fb86 ("KVM: x86: clear SMM flags before loading state while leaving SMM")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9dd23bdfc6cc..54d212fe9b15 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7106,7 +7106,10 @@ static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
 
 static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned emul_flags)
 {
-	emul_to_vcpu(ctxt)->arch.hflags = emul_flags;
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	vcpu->arch.hflags = emul_flags;
+	kvm_mmu_reset_context(vcpu);
 }
 
 static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
  2021-06-09 18:56 ` [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-10  8:26   ` Vitaly Kuznetsov
  2021-06-10 13:23   ` Paolo Bonzini
  2021-06-09 18:56 ` [PATCH 3/9] KVM: x86: Replace .set_hflags() with dedicated .exiting_smm() helper Sean Christopherson
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Use the recently introduced KVM_REQ_TRIPLE_FAULT to properly emulate
shutdown if RSM from SMM fails.

Note, entering shutdown after clearing the SMM flag and restoring NMI
blocking is architecturally correct with respect to AMD's APM, which KVM
also uses for SMRAM layout and RSM NMI blocking behavior.  The APM says:

  An RSM causes a processor shutdown if an invalid-state condition is
  found in the SMRAM state-save area. Only an external reset, external
  processor-initialization, or non-maskable external interrupt (NMI) can
  cause the processor to leave the shutdown state.

Of note is processor-initialization (INIT) as a valid shutdown wake
event, as INIT is blocked by SMM, implying that entering shutdown also
forces the CPU out of SMM.

For recent Intel CPUs, restoring NMI blocking is technically wrong, but
so is restoring NMI blocking in the first place, and Intel's RSM
"architecture" is such a mess that just about anything is allowed and can
be justified as micro-architectural behavior.

Per the SDM:

  On Pentium 4 and later processors, shutdown will inhibit INTR and A20M
  but will not change any of the other inhibits. On these processors,
  NMIs will be inhibited if no action is taken in the SMI handler to
  uninhibit them (see Section 34.8).

where Section 34.8 says:

  When the processor enters SMM while executing an NMI handler, the
  processor saves the SMRAM state save map but does not save the
  attribute to keep NMI interrupts disabled. Potentially, an NMI could be
  latched (while in SMM or upon exit) and serviced upon exit of SMM even
  though the previous NMI handler has still not completed.

I.e. RSM unconditionally unblocks NMI, but shutdown on RSM does not,
which directly contradicts KVM's behavior.  But, as mentioned
above, KVM follows AMD architecture and restores NMI blocking on RSM, so
that micro-architectural detail is already lost.

And for Pentium era CPUs, SMI# can break shutdown, meaning that at least
some Intel CPUs fully leave SMM when entering shutdown:

  In the shutdown state, Intel processors stop executing instructions
  until a RESET#, INIT# or NMI# is asserted.  While Pentium family
  processors recognize the SMI# signal in shutdown state, P6 family and
  Intel486 processors do not.

In other words, the fact that Intel CPUs have implemented the two
extremes gives KVM carte blanche when it comes to honoring Intel's
architecture for handling shutdown during RSM.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/emulate.c     | 12 +++++++-----
 arch/x86/kvm/kvm_emulate.h |  1 +
 arch/x86/kvm/x86.c         |  6 ++++++
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5e5de05a8fbf..0603a2c79093 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2683,7 +2683,7 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 	 * state-save area.
 	 */
 	if (ctxt->ops->pre_leave_smm(ctxt, buf))
-		return X86EMUL_UNHANDLEABLE;
+		goto emulate_shutdown;
 
 #ifdef CONFIG_X86_64
 	if (emulator_has_longmode(ctxt))
@@ -2692,14 +2692,16 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 #endif
 		ret = rsm_load_state_32(ctxt, buf);
 
-	if (ret != X86EMUL_CONTINUE) {
-		/* FIXME: should triple fault */
-		return X86EMUL_UNHANDLEABLE;
-	}
+	if (ret != X86EMUL_CONTINUE)
+		goto emulate_shutdown;
 
 	ctxt->ops->post_leave_smm(ctxt);
 
 	return X86EMUL_CONTINUE;
+
+emulate_shutdown:
+	ctxt->ops->triple_fault(ctxt);
+	return X86EMUL_UNHANDLEABLE;
 }
 
 static void
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 3e870bf9ca4d..9c34aa60e45f 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -233,6 +233,7 @@ struct x86_emulate_ops {
 	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
 			     const char *smstate);
 	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
+	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
 };
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 54d212fe9b15..cda148cf06fa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7123,6 +7123,11 @@ static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
 	kvm_smm_changed(emul_to_vcpu(ctxt));
 }
 
+static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
+{
+	kvm_make_request(KVM_REQ_TRIPLE_FAULT, emul_to_vcpu(ctxt));
+}
+
 static int emulator_set_xcr(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr)
 {
 	return __kvm_set_xcr(emul_to_vcpu(ctxt), index, xcr);
@@ -7172,6 +7177,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.set_hflags          = emulator_set_hflags,
 	.pre_leave_smm       = emulator_pre_leave_smm,
 	.post_leave_smm      = emulator_post_leave_smm,
+	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
 };
 
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 3/9] KVM: x86: Replace .set_hflags() with dedicated .exiting_smm() helper
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
  2021-06-09 18:56 ` [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Sean Christopherson
  2021-06-09 18:56 ` [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 4/9] KVM: x86: Invoke kvm_smm_changed() immediately after clearing SMM flag Sean Christopherson
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Replace the .set_hflags() emulator hook with a dedicated .exiting_smm(),
moving the SMM and SMM_INSIDE_NMI flag handling out of the emulator in
the process.  This is a step towards consolidating much of the logic in
kvm_smm_changed(), including the SMM hflags updates.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/emulate.c     | 3 +--
 arch/x86/kvm/kvm_emulate.h | 2 +-
 arch/x86/kvm/x86.c         | 6 +++---
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0603a2c79093..9e0d5900c011 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2638,8 +2638,7 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 	if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
 		ctxt->ops->set_nmi_mask(ctxt, false);
 
-	ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
-		~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK));
+	ctxt->ops->exiting_smm(ctxt);
 
 	/*
 	 * Get back to real mode, to prepare a safe state in which to load
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 9c34aa60e45f..b620782c1fb5 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -229,7 +229,7 @@ struct x86_emulate_ops {
 	void (*set_nmi_mask)(struct x86_emulate_ctxt *ctxt, bool masked);
 
 	unsigned (*get_hflags)(struct x86_emulate_ctxt *ctxt);
-	void (*set_hflags)(struct x86_emulate_ctxt *ctxt, unsigned hflags);
+	void (*exiting_smm)(struct x86_emulate_ctxt *ctxt);
 	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
 			     const char *smstate);
 	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cda148cf06fa..69a71c5019c1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7104,11 +7104,11 @@ static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
 	return emul_to_vcpu(ctxt)->arch.hflags;
 }
 
-static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned emul_flags)
+static void emulator_exiting_smm(struct x86_emulate_ctxt *ctxt)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 
-	vcpu->arch.hflags = emul_flags;
+	vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
 	kvm_mmu_reset_context(vcpu);
 }
 
@@ -7174,7 +7174,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.guest_has_fxsr      = emulator_guest_has_fxsr,
 	.set_nmi_mask        = emulator_set_nmi_mask,
 	.get_hflags          = emulator_get_hflags,
-	.set_hflags          = emulator_set_hflags,
+	.exiting_smm         = emulator_exiting_smm,
 	.pre_leave_smm       = emulator_pre_leave_smm,
 	.post_leave_smm      = emulator_post_leave_smm,
 	.triple_fault        = emulator_triple_fault,
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 4/9] KVM: x86: Invoke kvm_smm_changed() immediately after clearing SMM flag
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 3/9] KVM: x86: Replace .set_hflags() with dedicated .exiting_smm() helper Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 5/9] KVM: x86: Move (most) SMM hflags modifications into kvm_smm_changed() Sean Christopherson
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Move RSM emulation's call to kvm_smm_changed() from .post_leave_smm() to
.exiting_smm(), leaving behind the MMU context reset.  The primary
motivation is to allow for future cleanup, but this also fixes a bug of
sorts by queueing KVM_REQ_EVENT even if RSM causes shutdown, e.g. to let
an INIT wake the vCPU from shutdown.  Of course, KVM doesn't properly
emulate a shutdown state, e.g. KVM doesn't block SMIs after shutdown, and
immediately exits to userspace, so the event request is a moot point in
practice.

Moving kvm_smm_changed() also moves the RSM tracepoint.  This isn't
strictly necessary, but will allow consolidating the SMI and RSM
tracepoints in a future commit (by also moving the SMI tracepoint).
Invoking the tracepoint before loading SMRAM state also means the SMBASE
reported in the tracepoint will point at the state that will be used for
RSM, as opposed to the SMBASE _after_ RSM completes, which is arguably a
good thing if the tracepoint is being used to debug an RSM/SMM issue.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 69a71c5019c1..76ba28865824 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7109,7 +7109,7 @@ static void emulator_exiting_smm(struct x86_emulate_ctxt *ctxt)
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 
 	vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
-	kvm_mmu_reset_context(vcpu);
+	kvm_smm_changed(vcpu);
 }
 
 static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
@@ -7120,7 +7120,7 @@ static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
 
 static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
 {
-	kvm_smm_changed(emul_to_vcpu(ctxt));
+	kvm_mmu_reset_context(emul_to_vcpu(ctxt));
 }
 
 static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 5/9] KVM: x86: Move (most) SMM hflags modifications into kvm_smm_changed()
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (3 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 4/9] KVM: x86: Invoke kvm_smm_changed() immediately after clearing SMM flag Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 6/9] KVM: x86: Move "entering SMM" tracepoint " Sean Christopherson
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Move the core of SMM hflags modifications into kvm_smm_changed() and use
kvm_smm_changed() in enter_smm().  Clear HF_SMM_INSIDE_NMI_MASK for
leaving SMM but do not set it for entering SMM.  If the vCPU is executing
outside of SMM, the flag should unequivocally be cleared, e.g. this
technically fixes a benign bug where the flag could be left set after
KVM_SET_VCPU_EVENTS, but the reverse is not true as NMI blocking depends
on pre-SMM state or userspace input.

Note, this adds an extra kvm_mmu_reset_context() to enter_smm().  The
extra/early reset isn't strictly necessary, and in a way can never be
necessary since the vCPU/MMU context is in a half-baked state until the
final context reset at the end of the function.  But, enter_smm() is not
a hot path, and exploding on an invalid root_hpa is probably better than
having a stale SMM flag in the MMU role; it's at least no worse.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76ba28865824..13a33c962657 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4457,7 +4457,7 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 	memset(&events->reserved, 0, sizeof(events->reserved));
 }
 
-static void kvm_smm_changed(struct kvm_vcpu *vcpu);
+static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm);
 
 static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 					      struct kvm_vcpu_events *events)
@@ -4517,13 +4517,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 		vcpu->arch.apic->sipi_vector = events->sipi_vector;
 
 	if (events->flags & KVM_VCPUEVENT_VALID_SMM) {
-		if (!!(vcpu->arch.hflags & HF_SMM_MASK) != events->smi.smm) {
-			if (events->smi.smm)
-				vcpu->arch.hflags |= HF_SMM_MASK;
-			else
-				vcpu->arch.hflags &= ~HF_SMM_MASK;
-			kvm_smm_changed(vcpu);
-		}
+		if (!!(vcpu->arch.hflags & HF_SMM_MASK) != events->smi.smm)
+			kvm_smm_changed(vcpu, events->smi.smm);
 
 		vcpu->arch.smi_pending = events->smi.pending;
 
@@ -7108,8 +7103,7 @@ static void emulator_exiting_smm(struct x86_emulate_ctxt *ctxt)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 
-	vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
-	kvm_smm_changed(vcpu);
+	kvm_smm_changed(vcpu, false);
 }
 
 static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
@@ -7438,9 +7432,13 @@ static bool retry_instruction(struct x86_emulate_ctxt *ctxt,
 static int complete_emulated_mmio(struct kvm_vcpu *vcpu);
 static int complete_emulated_pio(struct kvm_vcpu *vcpu);
 
-static void kvm_smm_changed(struct kvm_vcpu *vcpu)
+static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm)
 {
-	if (!(vcpu->arch.hflags & HF_SMM_MASK)) {
+	if (entering_smm) {
+		vcpu->arch.hflags |= HF_SMM_MASK;
+	} else {
+		vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
+
 		/* This is a good place to trace that we are exiting SMM.  */
 		trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, false);
 
@@ -8912,7 +8910,7 @@ static void enter_smm(struct kvm_vcpu *vcpu)
 	 */
 	static_call(kvm_x86_pre_enter_smm)(vcpu, buf);
 
-	vcpu->arch.hflags |= HF_SMM_MASK;
+	kvm_smm_changed(vcpu, true);
 	kvm_vcpu_write_guest(vcpu, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
 
 	if (static_call(kvm_x86_get_nmi_mask)(vcpu))
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 6/9] KVM: x86: Move "entering SMM" tracepoint into kvm_smm_changed()
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (4 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 5/9] KVM: x86: Move (most) SMM hflags modifications into kvm_smm_changed() Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 7/9] KVM: x86: Rename SMM tracepoint to make it reflect reality Sean Christopherson
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Invoke the "entering SMM" tracepoint from kvm_smm_changed() instead of
enter_smm(), effectively moving it from before reading vCPU state to
after reading state (but still before writing it to SMRAM!).  The primary
motivation is to consolidate code, but calling the tracepoint from
kvm_smm_changed() also makes its invocation consistent with respect to
SMI and RSM, and with respect to KVM_SET_VCPU_EVENTS (which previously
only invoked the tracepoint when forcing the vCPU out of SMM).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 13a33c962657..11ea81c8cb82 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7434,14 +7434,13 @@ static int complete_emulated_pio(struct kvm_vcpu *vcpu);
 
 static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm)
 {
+	trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, entering_smm);
+
 	if (entering_smm) {
 		vcpu->arch.hflags |= HF_SMM_MASK;
 	} else {
 		vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
 
-		/* This is a good place to trace that we are exiting SMM.  */
-		trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, false);
-
 		/* Process a latched INIT or SMI, if any.  */
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
 	}
@@ -8894,7 +8893,6 @@ static void enter_smm(struct kvm_vcpu *vcpu)
 	char buf[512];
 	u32 cr0;
 
-	trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, true);
 	memset(buf, 0, 512);
 #ifdef CONFIG_X86_64
 	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 7/9] KVM: x86: Rename SMM tracepoint to make it reflect reality
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (5 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 6/9] KVM: x86: Move "entering SMM" tracepoint " Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 8/9] KVM: x86: Drop .post_leave_smm(), i.e. the manual post-RSM MMU reset Sean Christopherson
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Rename the SMM tracepoint, which handles both entering and exiting SMM,
from kvm_enter_smm to kvm_smm_transition.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/trace.h | 2 +-
 arch/x86/kvm/x86.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 4f839148948b..b484141ea15b 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -997,7 +997,7 @@ TRACE_EVENT(kvm_wait_lapic_expire,
 		  __entry->delta < 0 ? "early" : "late")
 );
 
-TRACE_EVENT(kvm_enter_smm,
+TRACE_EVENT(kvm_smm_transition,
 	TP_PROTO(unsigned int vcpu_id, u64 smbase, bool entering),
 	TP_ARGS(vcpu_id, smbase, entering),
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 11ea81c8cb82..06f3be2d170b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7434,7 +7434,7 @@ static int complete_emulated_pio(struct kvm_vcpu *vcpu);
 
 static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm)
 {
-	trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, entering_smm);
+	trace_kvm_smm_transition(vcpu->vcpu_id, vcpu->arch.smbase, entering_smm);
 
 	if (entering_smm) {
 		vcpu->arch.hflags |= HF_SMM_MASK;
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 8/9] KVM: x86: Drop .post_leave_smm(), i.e. the manual post-RSM MMU reset
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (6 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 7/9] KVM: x86: Rename SMM tracepoint to make it reflect reality Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-09 18:56 ` [PATCH 9/9] KVM: x86: Drop "pre_" from enter/leave_smm() helpers Sean Christopherson
  2021-06-10 13:28 ` [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Paolo Bonzini
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Drop the .post_leave_smm() emulator callback, which at this point is just
a wrapper around kvm_mmu_reset_context().  The manual context reset is
unnecessary, because unlike enter_smm() which calls vendor MSR/CR helpers
directly, em_rsm() bounces through the KVM helpers, e.g. kvm_set_cr4(),
which are responsible for processing side effects.  em_rsm() is already
subtly relying on this behavior as it doesn't manually do
kvm_update_cpuid_runtime(), e.g. to recognize CR4.OSXSAVE changes.
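
Roughly, the existing path that already triggers the reset when needed
(call chain approximate):

  em_rsm()
    rsm_load_state_64()                        /* or rsm_load_state_32() */
      rsm_enter_protected_mode()
        ctxt->ops->set_cr()                    /* emulator_set_cr() */
          kvm_set_cr0() / kvm_set_cr4()
            kvm_post_set_cr0() / kvm_post_set_cr4()
              kvm_mmu_reset_context()          /* on role-relevant changes */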

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/emulate.c     | 10 ++++++++--
 arch/x86/kvm/kvm_emulate.h |  1 -
 arch/x86/kvm/x86.c         |  6 ------
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9e0d5900c011..34c9f785d715 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2694,8 +2694,14 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 	if (ret != X86EMUL_CONTINUE)
 		goto emulate_shutdown;
 
-	ctxt->ops->post_leave_smm(ctxt);
-
+	/*
+	 * Note, the ctxt->ops callbacks are responsible for handling side
+	 * effects when writing MSRs and CRs, e.g. MMU context resets, CPUID
+	 * runtime updates, etc...  If that changes, e.g. this flow is moved
+	 * out of the emulator to make it look more like enter_smm(), then
+	 * those side effects need to be explicitly handled for both success
+	 * and shutdown.
+	 */
 	return X86EMUL_CONTINUE;
 
 emulate_shutdown:
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index b620782c1fb5..31dc7ca4ff2b 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -232,7 +232,6 @@ struct x86_emulate_ops {
 	void (*exiting_smm)(struct x86_emulate_ctxt *ctxt);
 	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
 			     const char *smstate);
-	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 06f3be2d170b..347849caf1df 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7112,11 +7112,6 @@ static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
 	return static_call(kvm_x86_pre_leave_smm)(emul_to_vcpu(ctxt), smstate);
 }
 
-static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
-{
-	kvm_mmu_reset_context(emul_to_vcpu(ctxt));
-}
-
 static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
 {
 	kvm_make_request(KVM_REQ_TRIPLE_FAULT, emul_to_vcpu(ctxt));
@@ -7170,7 +7165,6 @@ static const struct x86_emulate_ops emulate_ops = {
 	.get_hflags          = emulator_get_hflags,
 	.exiting_smm         = emulator_exiting_smm,
 	.pre_leave_smm       = emulator_pre_leave_smm,
-	.post_leave_smm      = emulator_post_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
 };
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* [PATCH 9/9] KVM: x86: Drop "pre_" from enter/leave_smm() helpers
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (7 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 8/9] KVM: x86: Drop .post_leave_smm(), i.e. the manual post-RSM MMU reset Sean Christopherson
@ 2021-06-09 18:56 ` Sean Christopherson
  2021-06-10 13:28 ` [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Paolo Bonzini
  9 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-09 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, syzbot+fb0b6a7e8713aeb0319c

Now that .post_leave_smm() is gone, drop "pre_" from the remaining
helpers.  The helpers aren't invoked purely before SMI/RSM processing,
e.g. both helpers are invoked after state is snapshotted (from regs or
SMRAM), and the RSM helper is invoked after some amount of register state
has been stuffed.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  4 ++--
 arch/x86/include/asm/kvm_host.h    |  4 ++--
 arch/x86/kvm/emulate.c             |  6 +++---
 arch/x86/kvm/kvm_emulate.h         |  3 +--
 arch/x86/kvm/svm/svm.c             |  8 ++++----
 arch/x86/kvm/vmx/vmx.c             |  8 ++++----
 arch/x86/kvm/x86.c                 | 14 +++++++-------
 7 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index e7bef91cee04..99b0c80db311 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -106,8 +106,8 @@ KVM_X86_OP_NULL(set_hv_timer)
 KVM_X86_OP_NULL(cancel_hv_timer)
 KVM_X86_OP(setup_mce)
 KVM_X86_OP(smi_allowed)
-KVM_X86_OP(pre_enter_smm)
-KVM_X86_OP(pre_leave_smm)
+KVM_X86_OP(enter_smm)
+KVM_X86_OP(leave_smm)
 KVM_X86_OP(enable_smi_window)
 KVM_X86_OP_NULL(mem_enc_op)
 KVM_X86_OP_NULL(mem_enc_reg_region)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9c7ced0e3171..de43070afdb6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1363,8 +1363,8 @@ struct kvm_x86_ops {
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
 	int (*smi_allowed)(struct kvm_vcpu *vcpu, bool for_injection);
-	int (*pre_enter_smm)(struct kvm_vcpu *vcpu, char *smstate);
-	int (*pre_leave_smm)(struct kvm_vcpu *vcpu, const char *smstate);
+	int (*enter_smm)(struct kvm_vcpu *vcpu, char *smstate);
+	int (*leave_smm)(struct kvm_vcpu *vcpu, const char *smstate);
 	void (*enable_smi_window)(struct kvm_vcpu *vcpu);
 
 	int (*mem_enc_op)(struct kvm *kvm, void __user *argp);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 34c9f785d715..cb57ad5961d0 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2677,11 +2677,11 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 	}
 
 	/*
-	 * Give pre_leave_smm() a chance to make ISA-specific changes to the
-	 * vCPU state (e.g. enter guest mode) before loading state from the SMM
+	 * Give leave_smm() a chance to make ISA-specific changes to the vCPU
+	 * state (e.g. enter guest mode) before loading state from the SMM
 	 * state-save area.
 	 */
-	if (ctxt->ops->pre_leave_smm(ctxt, buf))
+	if (ctxt->ops->leave_smm(ctxt, buf))
 		goto emulate_shutdown;
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 31dc7ca4ff2b..3a2b6f5c7193 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -230,8 +230,7 @@ struct x86_emulate_ops {
 
 	unsigned (*get_hflags)(struct x86_emulate_ctxt *ctxt);
 	void (*exiting_smm)(struct x86_emulate_ctxt *ctxt);
-	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
-			     const char *smstate);
+	int (*leave_smm)(struct x86_emulate_ctxt *ctxt, const char *smstate);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
 };
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8c3918a11826..322a0baa4b37 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4249,7 +4249,7 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 	return !svm_smi_blocked(vcpu);
 }
 
-static int svm_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	int ret;
@@ -4271,7 +4271,7 @@ static int svm_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 	return 0;
 }
 
-static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
+static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct kvm_host_map map;
@@ -4544,8 +4544,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.setup_mce = svm_setup_mce,
 
 	.smi_allowed = svm_smi_allowed,
-	.pre_enter_smm = svm_pre_enter_smm,
-	.pre_leave_smm = svm_pre_leave_smm,
+	.enter_smm = svm_enter_smm,
+	.leave_smm = svm_leave_smm,
 	.enable_smi_window = svm_enable_smi_window,
 
 	.mem_enc_op = svm_mem_enc_op,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 50b42d7a8a11..06e084ab16de 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7541,7 +7541,7 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 	return !is_smm(vcpu);
 }
 
-static int vmx_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -7555,7 +7555,7 @@ static int vmx_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 	return 0;
 }
 
-static int vmx_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
+static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	int ret;
@@ -7730,8 +7730,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.setup_mce = vmx_setup_mce,
 
 	.smi_allowed = vmx_smi_allowed,
-	.pre_enter_smm = vmx_pre_enter_smm,
-	.pre_leave_smm = vmx_pre_leave_smm,
+	.enter_smm = vmx_enter_smm,
+	.leave_smm = vmx_leave_smm,
 	.enable_smi_window = vmx_enable_smi_window,
 
 	.can_emulate_instruction = vmx_can_emulate_instruction,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 347849caf1df..7710932cc12a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7106,10 +7106,10 @@ static void emulator_exiting_smm(struct x86_emulate_ctxt *ctxt)
 	kvm_smm_changed(vcpu, false);
 }
 
-static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
+static int emulator_leave_smm(struct x86_emulate_ctxt *ctxt,
 				  const char *smstate)
 {
-	return static_call(kvm_x86_pre_leave_smm)(emul_to_vcpu(ctxt), smstate);
+	return static_call(kvm_x86_leave_smm)(emul_to_vcpu(ctxt), smstate);
 }
 
 static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
@@ -7164,7 +7164,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.set_nmi_mask        = emulator_set_nmi_mask,
 	.get_hflags          = emulator_get_hflags,
 	.exiting_smm         = emulator_exiting_smm,
-	.pre_leave_smm       = emulator_pre_leave_smm,
+	.leave_smm           = emulator_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
 };
@@ -8896,11 +8896,11 @@ static void enter_smm(struct kvm_vcpu *vcpu)
 		enter_smm_save_state_32(vcpu, buf);
 
 	/*
-	 * Give pre_enter_smm() a chance to make ISA-specific changes to the
-	 * vCPU state (e.g. leave guest mode) after we've saved the state into
-	 * the SMM state-save area.
+	 * Give enter_smm() a chance to make ISA-specific changes to the vCPU
+	 * state (e.g. leave guest mode) after we've saved the state into the
+	 * SMM state-save area.
 	 */
-	static_call(kvm_x86_pre_enter_smm)(vcpu, buf);
+	static_call(kvm_x86_enter_smm)(vcpu, buf);
 
 	kvm_smm_changed(vcpu, true);
 	kvm_vcpu_write_guest(vcpu, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
-- 
2.32.0.rc1.229.g3e70b5a671-goog



* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-09 18:56 ` [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails Sean Christopherson
@ 2021-06-10  8:26   ` Vitaly Kuznetsov
  2021-06-10 13:29     ` Paolo Bonzini
  2021-06-10 15:30     ` Sean Christopherson
  2021-06-10 13:23   ` Paolo Bonzini
  1 sibling, 2 replies; 18+ messages in thread
From: Vitaly Kuznetsov @ 2021-06-10  8:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	syzbot+fb0b6a7e8713aeb0319c, Paolo Bonzini

Sean Christopherson <seanjc@google.com> writes:

> Use the recently introduced KVM_REQ_TRIPLE_FAULT to properly emulate
> shutdown if RSM from SMM fails.
>
> Note, entering shutdown after clearing the SMM flag and restoring NMI
> blocking is architecturally correct with respect to AMD's APM, which KVM
> also uses for SMRAM layout and RSM NMI blocking behavior.  The APM says:
>
>   An RSM causes a processor shutdown if an invalid-state condition is
>   found in the SMRAM state-save area. Only an external reset, external
>   processor-initialization, or non-maskable external interrupt (NMI) can
>   cause the processor to leave the shutdown state.
>
> Of note is processor-initialization (INIT) as a valid shutdown wake
> event, as INIT is blocked by SMM, implying that entering shutdown also
> forces the CPU out of SMM.
>
> For recent Intel CPUs, restoring NMI blocking is technically wrong, but
> so is restoring NMI blocking in the first place, and Intel's RSM
> "architecture" is such a mess that just about anything is allowed and can
> be justified as micro-architectural behavior.
>
> Per the SDM:
>
>   On Pentium 4 and later processors, shutdown will inhibit INTR and A20M
>   but will not change any of the other inhibits. On these processors,
>   NMIs will be inhibited if no action is taken in the SMI handler to
>   uninhibit them (see Section 34.8).
>
> where Section 34.8 says:
>
>   When the processor enters SMM while executing an NMI handler, the
>   processor saves the SMRAM state save map but does not save the
>   attribute to keep NMI interrupts disabled. Potentially, an NMI could be
>   latched (while in SMM or upon exit) and serviced upon exit of SMM even
>   though the previous NMI handler has still not completed.
>
> I.e. RSM unconditionally unblocks NMI, but shutdown on RSM does not,
> which directly contradicts KVM's behavior.  But, as mentioned
> above, KVM follows AMD architecture and restores NMI blocking on RSM, so
> that micro-architectural detail is already lost.
>
> And for Pentium era CPUs, SMI# can break shutdown, meaning that at least
> some Intel CPUs fully leave SMM when entering shutdown:
>
>   In the shutdown state, Intel processors stop executing instructions
>   until a RESET#, INIT# or NMI# is asserted.  While Pentium family
>   processors recognize the SMI# signal in shutdown state, P6 family and
>   Intel486 processors do not.
>
> In other words, the fact that Intel CPUs have implemented the two
> extremes gives KVM carte blanche when it comes to honoring Intel's
> architecture for handling shutdown during RSM.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/emulate.c     | 12 +++++++-----
>  arch/x86/kvm/kvm_emulate.h |  1 +
>  arch/x86/kvm/x86.c         |  6 ++++++
>  3 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 5e5de05a8fbf..0603a2c79093 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -2683,7 +2683,7 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
>  	 * state-save area.
>  	 */
>  	if (ctxt->ops->pre_leave_smm(ctxt, buf))
> -		return X86EMUL_UNHANDLEABLE;
> +		goto emulate_shutdown;
>  
>  #ifdef CONFIG_X86_64
>  	if (emulator_has_longmode(ctxt))
> @@ -2692,14 +2692,16 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
>  #endif
>  		ret = rsm_load_state_32(ctxt, buf);
>  
> -	if (ret != X86EMUL_CONTINUE) {
> -		/* FIXME: should triple fault */
> -		return X86EMUL_UNHANDLEABLE;
> -	}
> +	if (ret != X86EMUL_CONTINUE)
> +		goto emulate_shutdown;
>  
>  	ctxt->ops->post_leave_smm(ctxt);
>  
>  	return X86EMUL_CONTINUE;
> +
> +emulate_shutdown:
> +	ctxt->ops->triple_fault(ctxt);
> +	return X86EMUL_UNHANDLEABLE;

I'm probably missing something, but what's the desired effect of both
raising KVM_REQ_TRIPLE_FAULT and returning X86EMUL_UNHANDLEABLE here?

I've modified smm selftest to see what's happening:

diff --git a/tools/testing/selftests/kvm/x86_64/smm_test.c b/tools/testing/selftests/kvm/x86_64/smm_test.c
index 613c42c5a9b8..cf215cd2c6e2 100644
--- a/tools/testing/selftests/kvm/x86_64/smm_test.c
+++ b/tools/testing/selftests/kvm/x86_64/smm_test.c
@@ -147,6 +147,11 @@ int main(int argc, char *argv[])
                            "Unexpected stage: #%x, got %x",
                            stage, stage_reported);
 
+               if (stage_reported == SMRAM_STAGE) {
+                       /* corrupt smram */
+                       memset(addr_gpa2hva(vm, SMRAM_GPA) + 0xfe00, 0xff, 512);
+               }
+
                state = vcpu_save_state(vm, VCPU_ID);
                kvm_vm_release(vm);
                kvm_vm_restart(vm, O_RDWR);

What I see is:

        smm_test-7600  [002]  4497.073918: kvm_exit:             reason EXIT_RSM rip 0x8004 info 0 0
        smm_test-7600  [002]  4497.073921: kvm_emulate_insn:     1000000:8004: 0f aa
        smm_test-7600  [002]  4497.073924: kvm_smm_transition:   vcpu 1: leaving SMM, smbase 0x1000000
        smm_test-7600  [002]  4497.073928: kvm_emulate_insn:     0:8004: 0f aa FAIL
        smm_test-7600  [002]  4497.073929: kvm_fpu:              unload
        smm_test-7600  [002]  4497.073930: kvm_userspace_exit:   reason KVM_EXIT_INTERNAL_ERROR (17)

If I change X86EMUL_UNHANDLEABLE to X86EMUL_CONTINUE, the triple fault
does happen (why don't we have triple fault printed in the trace by
default BTW???):

        smm_test-16810 [006]  5117.007220: kvm_exit:             reason EXIT_RSM rip 0x8004 info 0 0
        smm_test-16810 [006]  5117.007222: kvm_emulate_insn:     1000000:8004: 0f aa
        smm_test-16810 [006]  5117.007225: kvm_smm_transition:   vcpu 1: leaving SMM, smbase 0x1000000
        smm_test-16810 [006]  5117.007229: bputs:                vcpu_enter_guest: KVM_REQ_TRIPLE_FAULT
        smm_test-16810 [006]  5117.007230: kvm_fpu:              unload
        smm_test-16810 [006]  5117.007230: kvm_userspace_exit:   reason KVM_EXIT_SHUTDOWN (8)

So should we actually have X86EMUL_CONTINUE when we queue
KVM_REQ_TRIPLE_FAULT here?
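
(i.e. something along these lines, so the queued request actually gets
serviced:)

  emulate_shutdown:
          ctxt->ops->triple_fault(ctxt);
          return X86EMUL_CONTINUE;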

(Initially, my comment was supposed to be 'why don't you add
TRIPLE_FAULT to smm selftest?' but the above overshadows it)

>  }
>  
>  static void
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 3e870bf9ca4d..9c34aa60e45f 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -233,6 +233,7 @@ struct x86_emulate_ops {
>  	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
>  			     const char *smstate);
>  	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
> +	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
>  	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
>  };
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 54d212fe9b15..cda148cf06fa 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7123,6 +7123,11 @@ static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
>  	kvm_smm_changed(emul_to_vcpu(ctxt));
>  }
>  
> +static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
> +{
> +	kvm_make_request(KVM_REQ_TRIPLE_FAULT, emul_to_vcpu(ctxt));
> +}
> +
>  static int emulator_set_xcr(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr)
>  {
>  	return __kvm_set_xcr(emul_to_vcpu(ctxt), index, xcr);
> @@ -7172,6 +7177,7 @@ static const struct x86_emulate_ops emulate_ops = {
>  	.set_hflags          = emulator_set_hflags,
>  	.pre_leave_smm       = emulator_pre_leave_smm,
>  	.post_leave_smm      = emulator_post_leave_smm,
> +	.triple_fault        = emulator_triple_fault,
>  	.set_xcr             = emulator_set_xcr,
>  };

-- 
Vitaly



* Re: [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared
  2021-06-09 18:56 ` [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Sean Christopherson
@ 2021-06-10 13:19   ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-06-10 13:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, syzbot+fb0b6a7e8713aeb0319c

On 09/06/21 20:56, Sean Christopherson wrote:
> Immediately reset the MMU context when the vCPU's SMM flag is cleared so
> that the SMM flag in the MMU role is always synchronized with the vCPU's
> flag.  If RSM fails (which isn't correctly emulated), KVM will bail
> without calling post_leave_smm() and leave the MMU in a bad state.
> 
> The bad MMU role can lead to a NULL pointer dereference when grabbing a
> shadow page's rmap for a page fault as the initial lookups for the gfn
> will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will
> use the shadow page's SMM flag, which comes from the MMU (=1).  SMM has
> an entirely different set of memslots, and so the initial lookup can find
> a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1).
> 
>    general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
>    KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>    CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0
>    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>    RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline]
>    RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947
>    Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44
>    RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246
>    RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40
>    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
>    RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002
>    R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000
>    R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000
>    FS:  000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0
>    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>    Call Trace:
>     rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline]
>     mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604
>     __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline]
>     direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769
>     kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline]
>     kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065
>     vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122
>     vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428
>     vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494
>     kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722
>     kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460
>     vfs_ioctl fs/ioctl.c:51 [inline]
>     __do_sys_ioctl fs/ioctl.c:1069 [inline]
>     __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055
>     do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
>     entry_SYSCALL_64_after_hwframe+0x44/0xae
>    RIP: 0033:0x440ce9
> 
> Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com
> Fixes: 9ec19493fb86 ("KVM: x86: clear SMM flags before loading state while leaving SMM")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/x86.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9dd23bdfc6cc..54d212fe9b15 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7106,7 +7106,10 @@ static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
>   
>   static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned emul_flags)
>   {
> -	emul_to_vcpu(ctxt)->arch.hflags = emul_flags;
> +	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> +
> +	vcpu->arch.hflags = emul_flags;
> +	kvm_mmu_reset_context(vcpu);
>   }
>   
>   static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
> 

Queued for kvm/master, thanks.

Paolo



* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-09 18:56 ` [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails Sean Christopherson
  2021-06-10  8:26   ` Vitaly Kuznetsov
@ 2021-06-10 13:23   ` Paolo Bonzini
  1 sibling, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-06-10 13:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, syzbot+fb0b6a7e8713aeb0319c

On 09/06/21 20:56, Sean Christopherson wrote:
> For recent Intel CPUs, restoring NMI blocking is technically wrong, but
> so is restoring NMI blocking in the first place, and Intel's RSM
> "architecture" is such a mess that just about anything is allowed and can
> be justified as micro-architectural behavior.

The Intel manual is an absolute mess with respect to NMI blocking, and 
for once AMD followed suit.

Some versions of the AMD BIOS and Kernel Developer Manual provide the 
offset of the "NMI masked" flag in the SMM state save area, but 
unfortunately that was discovered too late.

Paolo



* Re: [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug
  2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
                   ` (8 preceding siblings ...)
  2021-06-09 18:56 ` [PATCH 9/9] KVM: x86: Drop "pre_" from enter/leave_smm() helpers Sean Christopherson
@ 2021-06-10 13:28 ` Paolo Bonzini
  9 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-06-10 13:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, syzbot+fb0b6a7e8713aeb0319c

On 09/06/21 20:56, Sean Christopherson wrote:
> Fix a NULL pointer dereference in gfn_to_rmap() that occurs if RSM fails,
> reported by syzbot.
> 
> The immediate problem is that the MMU context's role gets out of sync
> because KVM clears the SMM flag in the vCPU at the start of RSM emulation,
> but only resets the MMU context if RSM succeeds.  The divergence in vCPU
> vs. MMU role with respect to the SMM flag causes explosions if the non-SMM
> memslots have gfn ranges that are not present in the SMM memslots, because
> the MMU expects that the memslot for a shadow page cannot magically
> disappear.
> 
> The other obvious problem is that KVM doesn't emulate triple fault on RSM
> failure, e.g. it keeps running the vCPU in a frankenstate instead of
> exiting to userspace.  Fixing that would squash the syzbot repro, but
> would not fix the underlying issue because nothing prevents userspace from
> calling KVM_RUN on a vCPU that hit shutdown (yay lack of a shutdown state).
> But, it's easy to fix and definitely worth doing.
> 
> Everything after the two bug fixes is cleanup.
> 
> Ben Gardon has an internal patch or two that guards against the NULL
> pointer dereference in gfn_to_rmap().  I'm planning on getting that
> functionality posted (needs a little massaging) so that these types of
> snafus don't crash the host (this isn't the first time I've introduced a
> bug that broke gfn_to_rmap(), though thankfully it's the first time such
> a bug has made it upstream, knock on wood).
> 
> Amusingly, adding gfn_to_rmap() NULL memslot checks might even be a
> performance improvement.  Because gfn_to_rmap() doesn't check the memslot
> before using it, and because the compiler can see the search_memslots()
> returns NULL/0, gcc often/always generates dedicated (and hilarious) code
> for NULL, e.g. this #GP was caused by an explicit load from 0:
> 
>    48 8b 14 25 00 00 00 00	mov    0x0,%rdx
> 
> [shortlog and diffstat snipped]
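
A hedged sketch of what a NULL-memslot guard in gfn_to_rmap() could
look like (not Ben's actual patch; callers would also need to tolerate
a NULL return):

	static struct kvm_rmap_head *gfn_to_rmap(struct kvm *kvm, gfn_t gfn,
						 struct kvm_mmu_page *sp)
	{
		struct kvm_memslots *slots;
		struct kvm_memory_slot *slot;

		slots = kvm_memslots_for_spte_role(kvm, sp->role);
		slot = __gfn_to_memslot(slots, gfn);

		/* Don't let a missing memslot become a NULL dereference. */
		if (WARN_ON_ONCE(!slot))
			return NULL;

		return __gfn_to_rmap(gfn, sp->role.level, slot);
	}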

Queued 2-9 too for 5.14, with Vitaly's suggested change for patch 2.

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-10  8:26   ` Vitaly Kuznetsov
@ 2021-06-10 13:29     ` Paolo Bonzini
  2021-06-10 15:28       ` Sean Christopherson
  2021-06-10 15:30     ` Sean Christopherson
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2021-06-10 13:29 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	syzbot+fb0b6a7e8713aeb0319c

On 10/06/21 10:26, Vitaly Kuznetsov wrote:
> So should we actually have X86EMUL_CONTINUE when we queue
> KVM_REQ_TRIPLE_FAULT here?

Yes...

> (Initially, my comment was supposed to be 'why don't you add
> TRIPLE_FAULT to smm selftest?' but the above overshadows it)

... and a tenth patch to add a selftest would be nice to have indeed.
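
For reference, the shape of the fix in patch 2 (a sketch based on the
posted series; the final version folds in the X86EMUL_CONTINUE change
discussed above):

	/* x86.c: back the emulator's new ->triple_fault() callback. */
	static void emulator_triple_fault(struct x86_emulate_ctxt *ctxt)
	{
		kvm_make_request(KVM_REQ_TRIPLE_FAULT, emul_to_vcpu(ctxt));
	}

	/*
	 * emulate.c, on RSM failure: request shutdown and return
	 * X86EMUL_CONTINUE so the vCPU processes the triple fault
	 * instead of reporting an emulation error.
	 */
	ctxt->ops->triple_fault(ctxt);
	return X86EMUL_CONTINUE;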

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-10 13:29     ` Paolo Bonzini
@ 2021-06-10 15:28       ` Sean Christopherson
  0 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-06-10 15:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, syzbot+fb0b6a7e8713aeb0319c

On Thu, Jun 10, 2021, Paolo Bonzini wrote:
> On 10/06/21 10:26, Vitaly Kuznetsov wrote:
> > So should we actually have X86EMUL_CONTINUE when we queue
> > KVM_REQ_TRIPLE_FAULT here?
> 
> Yes...
> 
> > (Initially, my comment was supposed to be 'why don't you add
> > TRIPLE_FAULT to smm selftest?' but the above overshadows it)
> 
> ... and a tenth patch to add a selftest would be nice to have indeed.

Yes, I've been remiss in writing/modifying tests; I'll prioritize that once
I've dug myself out of the rabbit holes I wandered into via the vCPU
RESET/INIT series.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-10  8:26   ` Vitaly Kuznetsov
  2021-06-10 13:29     ` Paolo Bonzini
@ 2021-06-10 15:30     ` Sean Christopherson
  2021-06-11 11:42       ` Vitaly Kuznetsov
  1 sibling, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-06-10 15:30 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	syzbot+fb0b6a7e8713aeb0319c, Paolo Bonzini

On Thu, Jun 10, 2021, Vitaly Kuznetsov wrote:
> why don't we have triple fault printed in trace by default BTW???

Or maybe trace_kvm_make_request()?  Not sure how much pain that would inflict on
the binaries though.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails
  2021-06-10 15:30     ` Sean Christopherson
@ 2021-06-11 11:42       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 18+ messages in thread
From: Vitaly Kuznetsov @ 2021-06-11 11:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	syzbot+fb0b6a7e8713aeb0319c, Paolo Bonzini

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Jun 10, 2021, Vitaly Kuznetsov wrote:
>> why don't we have triple fault printed in trace by default BTW???
>
> Or maybe trace_kvm_make_request()?

... or maybe just trace_kvm_check_request()? 

$ git grep  'kvm_make_request(' arch/x86/kvm/ | wc -l
115
$ git grep  'kvm_check_request(' arch/x86/kvm/ | wc -l
41
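
A hypothetical sketch of what such a tracepoint could look like in
arch/x86/kvm/trace.h (neither trace_kvm_make_request() nor
trace_kvm_check_request() exists upstream at this point):

	TRACE_EVENT(kvm_check_request,
		TP_PROTO(unsigned int req, unsigned int vcpu_id),
		TP_ARGS(req, vcpu_id),

		TP_STRUCT__entry(
			__field(unsigned int, req)
			__field(unsigned int, vcpu_id)
		),

		TP_fast_assign(
			__entry->req = req;
			__entry->vcpu_id = vcpu_id;
		),

		TP_printk("vcpu %u, req 0x%x", __entry->vcpu_id, __entry->req)
	);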

-- 
Vitaly


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-06-11 11:42 UTC | newest]

Thread overview: 18+ messages
2021-06-09 18:56 [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Sean Christopherson
2021-06-09 18:56 ` [PATCH 1/9] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Sean Christopherson
2021-06-10 13:19   ` Paolo Bonzini
2021-06-09 18:56 ` [PATCH 2/9] KVM: x86: Emulate triple fault shutdown if RSM emulation fails Sean Christopherson
2021-06-10  8:26   ` Vitaly Kuznetsov
2021-06-10 13:29     ` Paolo Bonzini
2021-06-10 15:28       ` Sean Christopherson
2021-06-10 15:30     ` Sean Christopherson
2021-06-11 11:42       ` Vitaly Kuznetsov
2021-06-10 13:23   ` Paolo Bonzini
2021-06-09 18:56 ` [PATCH 3/9] KVM: x86: Replace .set_hflags() with dedicated .exiting_smm() helper Sean Christopherson
2021-06-09 18:56 ` [PATCH 4/9] KVM: x86: Invoke kvm_smm_changed() immediately after clearing SMM flag Sean Christopherson
2021-06-09 18:56 ` [PATCH 5/9] KVM: x86: Move (most) SMM hflags modifications into kvm_smm_changed() Sean Christopherson
2021-06-09 18:56 ` [PATCH 6/9] KVM: x86: Move "entering SMM" tracepoint " Sean Christopherson
2021-06-09 18:56 ` [PATCH 7/9] KVM: x86: Rename SMM tracepoint to make it reflect reality Sean Christopherson
2021-06-09 18:56 ` [PATCH 8/9] KVM: x86: Drop .post_leave_smm(), i.e. the manual post-RSM MMU reset Sean Christopherson
2021-06-09 18:56 ` [PATCH 9/9] KVM: x86: Drop "pre_" from enter/leave_smm() helpers Sean Christopherson
2021-06-10 13:28 ` [PATCH 0/9] KVM: x86: Fix NULL pointer #GP due to RSM bug Paolo Bonzini
