kvm.vger.kernel.org archive mirror
* [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte()
@ 2020-12-18  0:31 Sean Christopherson
  2020-12-18  0:31 ` [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte " Sean Christopherson
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Sean Christopherson @ 2020-12-18  0:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon, Richard Herbert

Two fixes for bugs that were introduced along with the TDP MMU (though I
strongly suspect only the one reported by Richard, fixed in patch 2/4, is
hittable in practice).  Two additional cleanups on top to try and make the
code a bit more readable and shave a few cycles.

Sean Christopherson (4):
  KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
  KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
  KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()

 arch/x86/kvm/mmu/mmu.c     | 53 +++++++++++++++++++++-----------------
 arch/x86/kvm/mmu/tdp_mmu.c |  9 ++++---
 arch/x86/kvm/mmu/tdp_mmu.h |  4 ++-
 3 files changed, 39 insertions(+), 27 deletions(-)

-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
  2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
@ 2020-12-18  0:31 ` Sean Christopherson
  2020-12-18  8:58   ` Vitaly Kuznetsov
  2020-12-18  0:31 ` [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE Sean Christopherson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2020-12-18  0:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon, Richard Herbert

Return -1 from the get_walk() helpers if the shadow walk doesn't fill at
least one spte, which can theoretically happen if the walk hits a
not-present PTPDR.  Returning the root level in such a case will cause
get_mmio_spte() to return garbage (uninitialized stack data).  In
practice, such a scenario should be impossible as KVM shouldn't get a
reserved-bit page fault with a not-present PDPTR.

Note, using mmu->root_level in get_walk() is wrong for other reasons,
too, but that's now a moot point.

Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c     | 7 ++++++-
 arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7a6ae9e90bd7..a48cd12c01d7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3488,7 +3488,7 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
 {
 	struct kvm_shadow_walk_iterator iterator;
-	int leaf = vcpu->arch.mmu->root_level;
+	int leaf = -1;
 	u64 spte;
 
 
@@ -3532,6 +3532,11 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	else
 		leaf = get_walk(vcpu, addr, sptes);
 
+	if (unlikely(leaf < 0)) {
+		*sptep = 0ull;
+		return reserved;
+	}
+
 	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
 
 	for (level = root; level >= leaf; level--) {
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 84c8f06bec26..50cec7a15ddb 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1152,8 +1152,8 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
 {
 	struct tdp_iter iter;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
-	int leaf = vcpu->arch.mmu->shadow_root_level;
 	gfn_t gfn = addr >> PAGE_SHIFT;
+	int leaf = -1;
 
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread
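
As an aside for readers skimming the archive: below is a minimal userspace
sketch of the bug class patch 1 closes and of the -1 sentinel that fixes it.
The names echo the kernel helpers, but the harness is a hypothetical
stand-in, not kernel source.

    #include <stdio.h>

    #define ROOT_LEVEL 4

    /*
     * Stand-in for get_walk(): records one fake SPTE per level walked and
     * returns the lowest level recorded, or -1 if the walk recorded
     * nothing at all (e.g. it hit a not-present PDPTR immediately).
     */
    static int get_walk_sketch(unsigned long long *sptes, int levels_walked)
    {
        int leaf = -1;    /* sentinel: no SPTE filled yet */
        int level;

        for (level = ROOT_LEVEL; level > ROOT_LEVEL - levels_walked; level--) {
            leaf = level;
            sptes[leaf - 1] = 0x1000ULL * level;    /* fake SPTE value */
        }
        return leaf;
    }

    int main(void)
    {
        unsigned long long sptes[ROOT_LEVEL];    /* uninitialized, like the kernel's stack array */
        int leaf = get_walk_sketch(sptes, 0);    /* a walk that fills nothing */

        /* The fix: bail out before reading uninitialized stack data. */
        if (leaf < 0) {
            puts("walk filled no SPTEs; reporting SPTE = 0");
            return 0;
        }
        printf("leaf SPTE = 0x%llx\n", sptes[leaf - 1]);
        return 0;
    }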

* [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
  2020-12-18  0:31 ` [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte " Sean Christopherson
@ 2020-12-18  0:31 ` Sean Christopherson
  2020-12-18  9:10   ` Vitaly Kuznetsov
  2020-12-18  0:31 ` [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array Sean Christopherson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2020-12-18  0:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon, Richard Herbert

Get the so-called "root" level from the low-level shadow page table
walkers instead of manually attempting to calculate it higher up the
stack, e.g. in get_mmio_spte().  When KVM is using PAE shadow paging,
the starting level of the walk, from the caller's perspective, is not
the CR3 root but rather the PDPTR "root".  Checking for reserved bits
from the CR3 root causes get_mmio_spte() to consume uninitialized stack
data due to indexing into sptes[] for a level that was not filled by
get_walk().  This can result in false positives and/or negatives
depending on what garbage happens to be on the stack.

Opportunistically nuke a few extra newlines.

Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Reported-by: Richard Herbert <rherbert@sympatico.ca>
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c     | 15 ++++++---------
 arch/x86/kvm/mmu/tdp_mmu.c |  5 ++++-
 arch/x86/kvm/mmu/tdp_mmu.h |  4 +++-
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a48cd12c01d7..52f36c879086 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3485,16 +3485,16 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
  * Return the level of the lowest level SPTE added to sptes.
  * That SPTE may be non-present.
  */
-static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
+static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	int leaf = -1;
 	u64 spte;
 
-
 	walk_shadow_page_lockless_begin(vcpu);
 
-	for (shadow_walk_init(&iterator, vcpu, addr);
+	for (shadow_walk_init(&iterator, vcpu, addr),
+	     *root_level = iterator.level;
 	     shadow_walk_okay(&iterator);
 	     __shadow_walk_next(&iterator, spte)) {
 		leaf = iterator.level;
@@ -3504,7 +3504,6 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
 
 		if (!is_shadow_present_pte(spte))
 			break;
-
 	}
 
 	walk_shadow_page_lockless_end(vcpu);
@@ -3517,9 +3516,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
 	u64 sptes[PT64_ROOT_MAX_LEVEL];
 	struct rsvd_bits_validate *rsvd_check;
-	int root = vcpu->arch.mmu->shadow_root_level;
-	int leaf;
-	int level;
+	int root, leaf, level;
 	bool reserved = false;
 
 	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
@@ -3528,9 +3525,9 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	}
 
 	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
-		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes);
+		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
 	else
-		leaf = get_walk(vcpu, addr, sptes);
+		leaf = get_walk(vcpu, addr, sptes, &root);
 
 	if (unlikely(leaf < 0)) {
 		*sptep = 0ull;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 50cec7a15ddb..a4f9447f8327 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1148,13 +1148,16 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
  * Return the level of the lowest level SPTE added to sptes.
  * That SPTE may be non-present.
  */
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
+int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			 int *root_level)
 {
 	struct tdp_iter iter;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	gfn_t gfn = addr >> PAGE_SHIFT;
 	int leaf = -1;
 
+	*root_level = vcpu->arch.mmu->shadow_root_level;
+
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
 		sptes[leaf - 1] = iter.old_spte;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 556e065503f6..cbbdbadd1526 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -44,5 +44,7 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
 bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn);
 
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes);
+int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			 int *root_level);
+
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread
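
For reference, a compilable sketch of the pattern patch 2 adopts: the walker
reports its true starting level through an out-parameter instead of the
caller assuming shadow_root_level.  The PAE-starts-one-level-down behavior
is taken from the commit message above; the function and struct names are
invented for the sketch.

    #include <stdio.h>

    struct walk_iter {
        int level;
    };

    /* With PAE shadow paging the walk starts at the PDPTE level (3),
     * not at the 4-level value a caller might assume. */
    static void walk_init(struct walk_iter *it, int pae)
    {
        it->level = pae ? 3 : 4;
    }

    /* The walker reports where the walk really began. */
    static int get_walk_sketch(int pae, int *root_level)
    {
        struct walk_iter it;

        walk_init(&it, pae);
        *root_level = it.level;    /* truth from the walker, not a caller guess */
        return 1;                  /* pretend the walk reached a 4KiB leaf */
    }

    int main(void)
    {
        int root, leaf;

        leaf = get_walk_sketch(1 /* PAE */, &root);
        /* A caller iterating root..leaf now touches only filled levels. */
        printf("walk covered levels %d..%d\n", root, leaf);
        return 0;
    }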

* [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
  2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
  2020-12-18  0:31 ` [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte " Sean Christopherson
  2020-12-18  0:31 ` [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE Sean Christopherson
@ 2020-12-18  0:31 ` Sean Christopherson
  2020-12-18  9:18   ` Vitaly Kuznetsov
  2020-12-18  0:31 ` [PATCH 4/4] KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte() Sean Christopherson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2020-12-18  0:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon, Richard Herbert

Bump the size of the sptes array by one and use the raw level of the
SPTE to index into the sptes array.  Using the SPTE level directly
improves readability by eliminating the need to reason out why the level
is being adjusted when indexing the array.  The array is on the stack
and is not explicitly initialized; bumping its size is nothing more than
a superficial adjustment to the stack frame.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c     | 15 +++++++--------
 arch/x86/kvm/mmu/tdp_mmu.c |  2 +-
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 52f36c879086..4798a4472066 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3500,7 +3500,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
 		leaf = iterator.level;
 		spte = mmu_spte_get_lockless(iterator.sptep);
 
-		sptes[leaf - 1] = spte;
+		sptes[leaf] = spte;
 
 		if (!is_shadow_present_pte(spte))
 			break;
@@ -3514,7 +3514,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
 /* return true if reserved bit is detected on spte. */
 static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
-	u64 sptes[PT64_ROOT_MAX_LEVEL];
+	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
 	struct rsvd_bits_validate *rsvd_check;
 	int root, leaf, level;
 	bool reserved = false;
@@ -3537,16 +3537,15 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
 
 	for (level = root; level >= leaf; level--) {
-		if (!is_shadow_present_pte(sptes[level - 1]))
+		if (!is_shadow_present_pte(sptes[level]))
 			break;
 		/*
 		 * Use a bitwise-OR instead of a logical-OR to aggregate the
 		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
 		 * adding a Jcc in the loop.
 		 */
-		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level - 1]) |
-			    __is_rsvd_bits_set(rsvd_check, sptes[level - 1],
-					       level);
+		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
+			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
 	}
 
 	if (reserved) {
@@ -3554,10 +3553,10 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 		       __func__, addr);
 		for (level = root; level >= leaf; level--)
 			pr_err("------ spte 0x%llx level %d.\n",
-			       sptes[level - 1], level);
+			       sptes[level], level);
 	}
 
-	*sptep = sptes[leaf - 1];
+	*sptep = sptes[leaf];
 
 	return reserved;
 }
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index a4f9447f8327..efef571806ad 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1160,7 +1160,7 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
-		sptes[leaf - 1] = iter.old_spte;
+		sptes[leaf] = iter.old_spte;
 	}
 
 	return leaf;
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread
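
A self-contained illustration of the indexing change in patch 3: size the
array one slot larger and index by the raw SPTE level, leaving slot 0
permanently unused.  The SPTE values are fake; only the indexing scheme
mirrors the patch.

    #include <stdio.h>

    #define PT64_ROOT_MAX_LEVEL 5

    int main(void)
    {
        /* One extra slot so sptes[level] is valid for levels 1..5;
         * slot 0 is simply never used. */
        unsigned long long sptes[PT64_ROOT_MAX_LEVEL + 1] = { 0 };
        int root = 4, leaf = 1, level;

        for (level = root; level >= leaf; level--)
            sptes[level] = 0x1000ULL * level;    /* fake SPTEs */

        /* Raw-level indexing: no "level - 1" translation to reason about. */
        for (level = root; level >= leaf; level--)
            printf("spte 0x%llx at level %d\n", sptes[level], level);
        return 0;
    }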

* [PATCH 4/4] KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
  2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
                   ` (2 preceding siblings ...)
  2020-12-18  0:31 ` [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array Sean Christopherson
@ 2020-12-18  0:31 ` Sean Christopherson
  2020-12-18  9:33   ` Vitaly Kuznetsov
       [not found] ` <2346556.XAFRqVoOGU@starbug.dom>
  2020-12-21 18:26 ` Paolo Bonzini
  5 siblings, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2020-12-18  0:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon, Richard Herbert

Check only the terminal leaf for a "!PRESENT || MMIO" SPTE when looking
for reserved bits on valid, non-MMIO SPTEs.  The get_walk() helpers
terminate their walks if a not-present or MMIO SPTE is encountered, i.e.
the non-terminal SPTEs have already been verified to be regular SPTEs.
This eliminates an extra check-and-branch in a relatively hot loop.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4798a4472066..769855f5f0a1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3511,7 +3511,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
 	return leaf;
 }
 
-/* return true if reserved bit is detected on spte. */
+/* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
 static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
 	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
@@ -3534,11 +3534,20 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 		return reserved;
 	}
 
+	*sptep = sptes[leaf];
+
+	/*
+	 * Skip reserved bits checks on the terminal leaf if it's not a valid
+	 * SPTE.  Note, this also (intentionally) skips MMIO SPTEs, which, by
+	 * design, always have reserved bits set.  The purpose of the checks is
+	 * to detect reserved bits on non-MMIO SPTEs, i.e. buggy SPTEs.
+	 */
+	if (!is_shadow_present_pte(sptes[leaf]))
+		leaf++;
+
 	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
 
-	for (level = root; level >= leaf; level--) {
-		if (!is_shadow_present_pte(sptes[level]))
-			break;
+	for (level = root; level >= leaf; level--)
 		/*
 		 * Use a bitwise-OR instead of a logical-OR to aggregate the
 		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
@@ -3546,7 +3555,6 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 		 */
 		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
 			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
-	}
 
 	if (reserved) {
 		pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump hierarchy:\n",
@@ -3556,8 +3564,6 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 			       sptes[level], level);
 	}
 
-	*sptep = sptes[leaf];
-
 	return reserved;
 }
 
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread
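
A sketch of the loop transformation in patch 4, relying on the invariant
stated above (only the terminal slot can hold a not-present or MMIO SPTE).
The bit tests below are invented placeholders, not the kernel's real
predicates.

    #include <stdbool.h>
    #include <stdio.h>

    #define PT64_ROOT_MAX_LEVEL 5

    /* Fake predicates for the sketch; the real checks live in spte.h. */
    static bool is_present(unsigned long long spte) { return spte & 0x1; }
    static bool bad_rsvd(unsigned long long spte)   { return spte & 0x8; }

    int main(void)
    {
        unsigned long long sptes[PT64_ROOT_MAX_LEVEL + 1] = { 0 };
        int root = 4, leaf = 1, level;
        bool reserved = false;

        sptes[4] = sptes[3] = sptes[2] = 0x1;    /* present non-leaf SPTEs */
        sptes[1] = 0x0;                          /* terminal leaf: not present */

        /* Only the terminal slot can be !present, so step past it once
         * up front instead of re-testing presence every iteration. */
        if (!is_present(sptes[leaf]))
            leaf++;

        for (level = root; level >= leaf; level--)
            reserved |= bad_rsvd(sptes[level]);  /* bitwise-OR avoids a branch */

        printf("reserved bits detected: %s\n", reserved ? "yes" : "no");
        return 0;
    }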

* [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte()
       [not found] ` <2346556.XAFRqVoOGU@starbug.dom>
@ 2020-12-18  1:27   ` Richard Herbert
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Herbert @ 2020-12-18  1:27 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon

Hi, Sean and all.

Thanks so much for these.  Very glad to report that the problem has been 
solved.  I applied all four patches, recompiled kernel 5.10.1 and successfully 
launched a Qemu VM.  Let's hope these will get merged into 5.10.2.

Thanks again for the hard work and quick fix.

Richard Herbert


On Thursday, December 17, 2020 7:31:35 PM EST Sean Christopherson wrote:

> Two fixes for bugs that were introduced along with the TDP MMU (though I
> strongly suspect only the one reported by Richard, fixed in patch 2/4, is
> hittable in practice).  Two additional cleanups on top to try and make the
> code a bit more readable and shave a few cycles.
> 
> Sean Christopherson (4):
>   KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
>   KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
>   KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
>   KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
> 
>  arch/x86/kvm/mmu/mmu.c     | 53 +++++++++++++++++++++-----------------
>  arch/x86/kvm/mmu/tdp_mmu.c |  9 ++++---
>  arch/x86/kvm/mmu/tdp_mmu.h |  4 ++-
>  3 files changed, 39 insertions(+), 27 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
  2020-12-18  0:31 ` [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte " Sean Christopherson
@ 2020-12-18  8:58   ` Vitaly Kuznetsov
  2020-12-21 17:05     ` Sean Christopherson
  0 siblings, 1 reply; 14+ messages in thread
From: Vitaly Kuznetsov @ 2020-12-18  8:58 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

Sean Christopherson <seanjc@google.com> writes:

> Return -1 from the get_walk() helpers if the shadow walk doesn't fill at
> least one spte, which can theoretically happen if the walk hits a
> not-present PTPDR.  Returning the root level in such a case will cause

PDPTR

> get_mmio_spte() to return garbage (uninitialized stack data).  In
> practice, such a scenario should be impossible as KVM shouldn't get a
> reserved-bit page fault with a not-present PDPTR.
>
> Note, using mmu->root_level in get_walk() is wrong for other reasons,
> too, but that's now a moot point.
>
> Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
> Cc: Ben Gardon <bgardon@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c     | 7 ++++++-
>  arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7a6ae9e90bd7..a48cd12c01d7 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3488,7 +3488,7 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
>  {
>  	struct kvm_shadow_walk_iterator iterator;
> -	int leaf = vcpu->arch.mmu->root_level;
> +	int leaf = -1;
>  	u64 spte;
>  
>  
> @@ -3532,6 +3532,11 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  	else
>  		leaf = get_walk(vcpu, addr, sptes);
>  
> +	if (unlikely(leaf < 0)) {
> +		*sptep = 0ull;
> +		return reserved;
> +	}

When SPTE=0 is returned from get_mmio_spte(), handle_mmio_page_fault()
will return RET_PF_RETRY -- should it be RET_PF_INVALID instead?

> +
>  	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
>  
>  	for (level = root; level >= leaf; level--) {
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 84c8f06bec26..50cec7a15ddb 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1152,8 +1152,8 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
>  {
>  	struct tdp_iter iter;
>  	struct kvm_mmu *mmu = vcpu->arch.mmu;
> -	int leaf = vcpu->arch.mmu->shadow_root_level;
>  	gfn_t gfn = addr >> PAGE_SHIFT;
> +	int leaf = -1;
>  
>  	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
>  		leaf = iter.level;

-- 
Vitaly


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  2020-12-18  0:31 ` [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE Sean Christopherson
@ 2020-12-18  9:10   ` Vitaly Kuznetsov
  2020-12-21 18:24     ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Vitaly Kuznetsov @ 2020-12-18  9:10 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

Sean Christopherson <seanjc@google.com> writes:

> Get the so-called "root" level from the low-level shadow page table
> walkers instead of manually attempting to calculate it higher up the
> stack, e.g. in get_mmio_spte().  When KVM is using PAE shadow paging,
> the starting level of the walk, from the caller's perspective, is not
> the CR3 root but rather the PDPTR "root".  Checking for reserved bits
> from the CR3 root causes get_mmio_spte() to consume uninitialized stack
> data due to indexing into sptes[] for a level that was not filled by
> get_walk().  This can result in false positives and/or negatives
> depending on what garbage happens to be on the stack.
>
> Opportunistically nuke a few extra newlines.
>
> Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
> Reported-by: Richard Herbert <rherbert@sympatico.ca>
> Cc: Ben Gardon <bgardon@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c     | 15 ++++++---------
>  arch/x86/kvm/mmu/tdp_mmu.c |  5 ++++-
>  arch/x86/kvm/mmu/tdp_mmu.h |  4 +++-
>  3 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index a48cd12c01d7..52f36c879086 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3485,16 +3485,16 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>   * Return the level of the lowest level SPTE added to sptes.
>   * That SPTE may be non-present.
>   */
> -static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
> +static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
>  {
>  	struct kvm_shadow_walk_iterator iterator;
>  	int leaf = -1;
>  	u64 spte;
>  
> -
>  	walk_shadow_page_lockless_begin(vcpu);
>  
> -	for (shadow_walk_init(&iterator, vcpu, addr);
> +	for (shadow_walk_init(&iterator, vcpu, addr),
> +	     *root_level = iterator.level;
>  	     shadow_walk_okay(&iterator);
>  	     __shadow_walk_next(&iterator, spte)) {
>  		leaf = iterator.level;
> @@ -3504,7 +3504,6 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
>  
>  		if (!is_shadow_present_pte(spte))
>  			break;
> -
>  	}
>  
>  	walk_shadow_page_lockless_end(vcpu);
> @@ -3517,9 +3516,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  {
>  	u64 sptes[PT64_ROOT_MAX_LEVEL];
>  	struct rsvd_bits_validate *rsvd_check;
> -	int root = vcpu->arch.mmu->shadow_root_level;
> -	int leaf;
> -	int level;
> +	int root, leaf, level;
>  	bool reserved = false;

Personal taste: I would've renamed 'root' to 'root_level' (to be
consistent with get_walk()/kvm_tdp_mmu_get_walk()) and 'level' to
e.g. 'l' as it's only being used as an iterator ('i' would also do).

>  
>  	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
> @@ -3528,9 +3525,9 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  	}
>  
>  	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
> -		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes);
> +		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
>  	else
> -		leaf = get_walk(vcpu, addr, sptes);
> +		leaf = get_walk(vcpu, addr, sptes, &root);
>  
>  	if (unlikely(leaf < 0)) {
>  		*sptep = 0ull;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 50cec7a15ddb..a4f9447f8327 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1148,13 +1148,16 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
>   * Return the level of the lowest level SPTE added to sptes.
>   * That SPTE may be non-present.
>   */
> -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
> +int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> +			 int *root_level)
>  {
>  	struct tdp_iter iter;
>  	struct kvm_mmu *mmu = vcpu->arch.mmu;
>  	gfn_t gfn = addr >> PAGE_SHIFT;
>  	int leaf = -1;
>  
> +	*root_level = vcpu->arch.mmu->shadow_root_level;
> +
>  	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
>  		leaf = iter.level;
>  		sptes[leaf - 1] = iter.old_spte;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
> index 556e065503f6..cbbdbadd1526 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.h
> +++ b/arch/x86/kvm/mmu/tdp_mmu.h
> @@ -44,5 +44,7 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
>  bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
>  				   struct kvm_memory_slot *slot, gfn_t gfn);
>  
> -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes);
> +int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> +			 int *root_level);
> +
>  #endif /* __KVM_X86_MMU_TDP_MMU_H */

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
  2020-12-18  0:31 ` [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array Sean Christopherson
@ 2020-12-18  9:18   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2020-12-18  9:18 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

Sean Christopherson <seanjc@google.com> writes:

> Bump the size of the sptes array by one and use the raw level of the
> SPTE to index into the sptes array.  Using the SPTE level directly
> improves readability by eliminating the need to reason out why the level
> is being adjusted when indexing the array.  The array is on the stack
> and is not explicitly initialized; bumping its size is nothing more than
> a superficial adjustment to the stack frame.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c     | 15 +++++++--------
>  arch/x86/kvm/mmu/tdp_mmu.c |  2 +-
>  2 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 52f36c879086..4798a4472066 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3500,7 +3500,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
>  		leaf = iterator.level;
>  		spte = mmu_spte_get_lockless(iterator.sptep);
>  
> -		sptes[leaf - 1] = spte;
> +		sptes[leaf] = spte;
>  
>  		if (!is_shadow_present_pte(spte))
>  			break;
> @@ -3514,7 +3514,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
>  /* return true if reserved bit is detected on spte. */
>  static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  {
> -	u64 sptes[PT64_ROOT_MAX_LEVEL];
> +	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
>  	struct rsvd_bits_validate *rsvd_check;
>  	int root, leaf, level;
>  	bool reserved = false;
> @@ -3537,16 +3537,15 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
>  
>  	for (level = root; level >= leaf; level--) {
> -		if (!is_shadow_present_pte(sptes[level - 1]))
> +		if (!is_shadow_present_pte(sptes[level]))
>  			break;
>  		/*
>  		 * Use a bitwise-OR instead of a logical-OR to aggregate the
>  		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
>  		 * adding a Jcc in the loop.
>  		 */
> -		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level - 1]) |
> -			    __is_rsvd_bits_set(rsvd_check, sptes[level - 1],
> -					       level);
> +		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
> +			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
>  	}
>  
>  	if (reserved) {
> @@ -3554,10 +3553,10 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  		       __func__, addr);
>  		for (level = root; level >= leaf; level--)
>  			pr_err("------ spte 0x%llx level %d.\n",
> -			       sptes[level - 1], level);
> +			       sptes[level], level);
>  	}
>  
> -	*sptep = sptes[leaf - 1];
> +	*sptep = sptes[leaf];
>  
>  	return reserved;
>  }
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index a4f9447f8327..efef571806ad 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1160,7 +1160,7 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
>  
>  	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
>  		leaf = iter.level;
> -		sptes[leaf - 1] = iter.old_spte;
> +		sptes[leaf] = iter.old_spte;
>  	}
>  
>  	return leaf;

An alternative solution would've been to reverse the array and fill it
like

 sptes[PT64_ROOT_MAX_LEVEL - leaf] = iter.old_spte;

but this may not reach the goal of 'improved readability' :-)

Also, we may add an MMU_DEBUG ifdef-ed check that sptes[0] remains
intact.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
  2020-12-18  0:31 ` [PATCH 4/4] KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte() Sean Christopherson
@ 2020-12-18  9:33   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2020-12-18  9:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

Sean Christopherson <seanjc@google.com> writes:

> Check only the terminal leaf for a "!PRESENT || MMIO" SPTE when looking
> for reserved bits on valid, non-MMIO SPTEs.  The get_walk() helpers
> terminate their walks if a not-present or MMIO SPTE is encountered, i.e.
> the non-terminal SPTEs have already been verified to be regular SPTEs.
> This eliminates an extra check-and-branch in a relatively hot loop.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4798a4472066..769855f5f0a1 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3511,7 +3511,7 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
>  	return leaf;
>  }
>  
> -/* return true if reserved bit is detected on spte. */
> +/* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
>  static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  {
>  	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
> @@ -3534,11 +3534,20 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  		return reserved;
>  	}
>  
> +	*sptep = sptes[leaf];
> +
> +	/*
> +	 * Skip reserved bits checks on the terminal leaf if it's not a valid
> +	 * SPTE.  Note, this also (intentionally) skips MMIO SPTEs, which, by
> +	 * design, always have reserved bits set.  The purpose of the checks is
> +	 * to detect reserved bits on non-MMIO SPTEs, i.e. buggy SPTEs.
> +	 */
> +	if (!is_shadow_present_pte(sptes[leaf]))
> +		leaf++;
> +
>  	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
>  
> -	for (level = root; level >= leaf; level--) {
> -		if (!is_shadow_present_pte(sptes[level]))
> -			break;
> +	for (level = root; level >= leaf; level--)
>  		/*
>  		 * Use a bitwise-OR instead of a logical-OR to aggregate the
>  		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
> @@ -3546,7 +3555,6 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  		 */
>  		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
>  			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
> -	}
>  
>  	if (reserved) {
>  		pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump hierarchy:\n",
> @@ -3556,8 +3564,6 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  			       sptes[level], level);
>  	}
>  
> -	*sptep = sptes[leaf];
> -
>  	return reserved;
>  }

FWIW (as I got a bit lost in tdp-mmu code when checking that
is_shadow_present_pte() is always performed)

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
  2020-12-18  8:58   ` Vitaly Kuznetsov
@ 2020-12-21 17:05     ` Sean Christopherson
  0 siblings, 0 replies; 14+ messages in thread
From: Sean Christopherson @ 2020-12-21 17:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

On Fri, Dec 18, 2020, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > Return -1 from the get_walk() helpers if the shadow walk doesn't fill at
> > least one spte, which can theoretically happen if the walk hits a
> > not-present PTPDR.  Returning the root level in such a case will cause
> 
> PDPTR

Doh.

> > get_mmio_spte() to return garbage (uninitialized stack data).  In
> > practice, such a scenario should be impossible as KVM shouldn't get a
> > reserved-bit page fault with a not-present PDPTR.
> >
> > Note, using mmu->root_level in get_walk() is wrong for other reasons,
> > too, but that's now a moot point.
> >
> > Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
> > Cc: Ben Gardon <bgardon@google.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c     | 7 ++++++-
> >  arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7a6ae9e90bd7..a48cd12c01d7 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3488,7 +3488,7 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
> >  static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes)
> >  {
> >  	struct kvm_shadow_walk_iterator iterator;
> > -	int leaf = vcpu->arch.mmu->root_level;
> > +	int leaf = -1;
> >  	u64 spte;
> >  
> >  
> > @@ -3532,6 +3532,11 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
> >  	else
> >  		leaf = get_walk(vcpu, addr, sptes);
> >  
> > +	if (unlikely(leaf < 0)) {
> > +		*sptep = 0ull;
> > +		return reserved;
> > +	}
> 
> When SPTE=0 is returned from get_mmio_spte(), handle_mmio_page_fault()
> will return RET_PF_RETRY -- should it be RET_PF_INVALID instead?

No, RET_PF_RETRY is the most appropriate.  A pae_root entry will only be zero if
the corresponding guest PDPTR is !PRESENT, i.e. the page fault is effectively in
the guest context.  The reason I say it should be an impossible condition is
because KVM should also reset the MMU whenever it snapshots the guest's PDPTRs,
i.e. it should be impossible to install an MMIO SPTE if the relevant PDPTR is
!PRESENT, and all MMIO SPTEs should be wiped out if the PDPTRs are reloaded.
I suppose by that argument, this should be a WARN_ON_ONCE, but I'm not sure if
I'm _that_ confident in my analysis :-)

Related side topic, this snippet in get_mmio_spte() is dead code, as the same
check is performed by its sole caller.  I'll send a patch to remove it (unless
Paolo wants a v2 of this series, in which case I'll tack it on the end).

	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
		*sptep = 0ull;
		return reserved;
	}

^ permalink raw reply	[flat|nested] 14+ messages in thread
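
To make the RET_PF_RETRY discussion concrete, here is a hedged paraphrase of
the caller-side dispatch as a standalone program.  The real
handle_mmio_page_fault() differs in detail (including how the reserved case
is reported), and the SPTE bit encodings below are invented for the sketch.

    #include <stdbool.h>
    #include <stdio.h>

    enum { RET_PF_RETRY, RET_PF_EMULATE, RET_PF_INVALID };

    /* Invented encoding for the sketch only. */
    static bool is_mmio_spte(unsigned long long spte) { return spte & 0x4; }

    static int dispatch(unsigned long long spte, bool reserved)
    {
        if (reserved)
            return RET_PF_INVALID;    /* buggy SPTE, punt to the slow path */

        if (is_mmio_spte(spte))
            return RET_PF_EMULATE;    /* genuine MMIO access, emulate it */

        /*
         * spte == 0 -- including the new leaf < 0 bail-out -- lands here:
         * treat it like a zapped mapping and let the vCPU fault again,
         * per the reasoning in the reply above.
         */
        return RET_PF_RETRY;
    }

    int main(void)
    {
        printf("zero spte    -> %d (RETRY)\n",   dispatch(0x0, false));
        printf("mmio spte    -> %d (EMULATE)\n", dispatch(0x4, false));
        printf("reserved set -> %d (INVALID)\n", dispatch(0x9, true));
        return 0;
    }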

* Re: [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  2020-12-18  9:10   ` Vitaly Kuznetsov
@ 2020-12-21 18:24     ` Paolo Bonzini
  2020-12-21 18:30       ` Sean Christopherson
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2020-12-21 18:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Sean Christopherson
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	Ben Gardon, Richard Herbert

On 18/12/20 10:10, Vitaly Kuznetsov wrote:
>> -	int root = vcpu->arch.mmu->shadow_root_level;
>> -	int leaf;
>> -	int level;
>> +	int root, leaf, level;
>>   	bool reserved = false;
> Personal taste: I would've renamed 'root' to 'root_level' (to be
> consistent with get_walk()/kvm_tdp_mmu_get_walk()) and 'level' to
> e.g. 'l' as it's only being used as an interator ('i' would also do).

Maybe agree on the former, not really on the latter. :)

Paolo


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte()
  2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
                   ` (4 preceding siblings ...)
       [not found] ` <2346556.XAFRqVoOGU@starbug.dom>
@ 2020-12-21 18:26 ` Paolo Bonzini
  5 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2020-12-21 18:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

On 18/12/20 01:31, Sean Christopherson wrote:
> Two fixes for bugs that were introduced along with the TDP MMU (though I
> strongly suspect only the one reported by Richard, fixed in patch 2/4, is
> hittable in practice).  Two additional cleanups on top to try and make the
> code a bit more readable and shave a few cycles.
> 
> Sean Christopherson (4):
>    KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
>    KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
>    KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
>    KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
> 
>   arch/x86/kvm/mmu/mmu.c     | 53 +++++++++++++++++++++-----------------
>   arch/x86/kvm/mmu/tdp_mmu.c |  9 ++++---
>   arch/x86/kvm/mmu/tdp_mmu.h |  4 ++-
>   3 files changed, 39 insertions(+), 27 deletions(-)
> 

Queued, thanks (and thanks for Ccing stable on the first two already :)).

Paolo


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  2020-12-21 18:24     ` Paolo Bonzini
@ 2020-12-21 18:30       ` Sean Christopherson
  0 siblings, 0 replies; 14+ messages in thread
From: Sean Christopherson @ 2020-12-21 18:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon, Richard Herbert

On Mon, Dec 21, 2020, Paolo Bonzini wrote:
> On 18/12/20 10:10, Vitaly Kuznetsov wrote:
> > > -	int root = vcpu->arch.mmu->shadow_root_level;
> > > -	int leaf;
> > > -	int level;
> > > +	int root, leaf, level;
> > >   	bool reserved = false;
> > Personal taste: I would've renamed 'root' to 'root_level' (to be
> > consistent with get_walk()/kvm_tdp_mmu_get_walk()) and 'level' to
> > e.g. 'l' as it's only being used as an iterator ('i' would also do).
> 
> Maybe agree on the former, not really on the latter. :)

Same here.  I kept 'root' to reduce code churn, even though I'd probably have
used 'root_level' if I were writing from scratch.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-12-21 18:31 UTC | newest]

Thread overview: 14+ messages
2020-12-18  0:31 [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte() Sean Christopherson
2020-12-18  0:31 ` [PATCH 1/4] KVM: x86/mmu: Use -1 to flag an undefined spte " Sean Christopherson
2020-12-18  8:58   ` Vitaly Kuznetsov
2020-12-21 17:05     ` Sean Christopherson
2020-12-18  0:31 ` [PATCH 2/4] KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE Sean Christopherson
2020-12-18  9:10   ` Vitaly Kuznetsov
2020-12-21 18:24     ` Paolo Bonzini
2020-12-21 18:30       ` Sean Christopherson
2020-12-18  0:31 ` [PATCH 3/4] KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array Sean Christopherson
2020-12-18  9:18   ` Vitaly Kuznetsov
2020-12-18  0:31 ` [PATCH 4/4] KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte() Sean Christopherson
2020-12-18  9:33   ` Vitaly Kuznetsov
     [not found] ` <2346556.XAFRqVoOGU@starbug.dom>
2020-12-18  1:27   ` [PATCH 0/4] KVM: x86/mmu: Bug fixes and cleanups " Richard Herbert
2020-12-21 18:26 ` Paolo Bonzini
