All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work
@ 2015-11-06  7:20 Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:20 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Patch 1/2/3 are easy ones.

Following two, patch 4/5, may not be ideal solutions, but at least
explain, or try to explain, the problems.

Takuya Yoshikawa (5):
  KVM: x86: MMU: Remove unused parameter of __direct_map()
  KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  KVM: x86: MMU: Make mmu_set_spte() return emulate value
  KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes

 Documentation/virtual/kvm/mmu.txt |   4 +-
 arch/x86/kvm/mmu.c                | 118 ++++++++++++++++++++------------------
 arch/x86/kvm/mmu_audit.c          |   2 +-
 arch/x86/kvm/paging_tmpl.h        |  10 ++--
 4 files changed, 70 insertions(+), 64 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map()
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
@ 2015-11-06  7:21 ` Takuya Yoshikawa
  2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:21 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7d85bca..a76bc04 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,9 +2708,8 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
 	__direct_pte_prefetch(vcpu, sp, sptep);
 }
 
-static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int map_writable, int level, gfn_t gfn, pfn_t pfn,
-			bool prefault)
+static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
+			int level, gfn_t gfn, pfn_t pfn, bool prefault)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -3018,8 +3017,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, v, write, map_writable, level, gfn, pfn,
-			 prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 
@@ -3541,8 +3539,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, gpa, write, map_writable,
-			 level, gfn, pfn, prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
@ 2015-11-06  7:22 ` Takuya Yoshikawa
  2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:22 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Both __mmu_unsync_walk() and mmu_pages_clear_parents() have three line
code which clears a bit in the unsync child bitmap; the former places it
inside a loop block and uses a few goto statements to jump to it.

A new helper function, clear_unsync_child_bit(), makes the code cleaner.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a76bc04..a9622a2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1806,6 +1806,13 @@ static int mmu_pages_add(struct kvm_mmu_pages *pvec, struct kvm_mmu_page *sp,
 	return (pvec->nr == KVM_PAGE_ARRAY_NR);
 }
 
+static inline void clear_unsync_child_bit(struct kvm_mmu_page *sp, int idx)
+{
+	--sp->unsync_children;
+	WARN_ON((int)sp->unsync_children < 0);
+	__clear_bit(idx, sp->unsync_child_bitmap);
+}
+
 static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 			   struct kvm_mmu_pages *pvec)
 {
@@ -1815,8 +1822,10 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 		struct kvm_mmu_page *child;
 		u64 ent = sp->spt[i];
 
-		if (!is_shadow_present_pte(ent) || is_large_pte(ent))
-			goto clear_child_bitmap;
+		if (!is_shadow_present_pte(ent) || is_large_pte(ent)) {
+			clear_unsync_child_bit(sp, i);
+			continue;
+		}
 
 		child = page_header(ent & PT64_BASE_ADDR_MASK);
 
@@ -1825,28 +1834,21 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 				return -ENOSPC;
 
 			ret = __mmu_unsync_walk(child, pvec);
-			if (!ret)
-				goto clear_child_bitmap;
-			else if (ret > 0)
+			if (!ret) {
+				clear_unsync_child_bit(sp, i);
+				continue;
+			} else if (ret > 0) {
 				nr_unsync_leaf += ret;
-			else
+			} else
 				return ret;
 		} else if (child->unsync) {
 			nr_unsync_leaf++;
 			if (mmu_pages_add(pvec, child, i))
 				return -ENOSPC;
 		} else
-			 goto clear_child_bitmap;
-
-		continue;
-
-clear_child_bitmap:
-		__clear_bit(i, sp->unsync_child_bitmap);
-		sp->unsync_children--;
-		WARN_ON((int)sp->unsync_children < 0);
+			clear_unsync_child_bit(sp, i);
 	}
 
-
 	return nr_unsync_leaf;
 }
 
@@ -2009,9 +2011,7 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
 		if (!sp)
 			return;
 
-		--sp->unsync_children;
-		WARN_ON((int)sp->unsync_children < 0);
-		__clear_bit(idx, sp->unsync_child_bitmap);
+		clear_unsync_child_bit(sp, idx);
 		level++;
 	} while (level < PT64_ROOT_LEVEL-1 && !sp->unsync_children);
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
  2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
@ 2015-11-06  7:23 ` Takuya Yoshikawa
  2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:23 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

mmu_set_spte()'s code is based on the assumption that the emulate
parameter has a valid pointer value if set_spte() returns true and
write_fault is not zero.  In other cases, emulate may be NULL, so a
NULL-check is needed.

Stop passing emulate pointer and make mmu_set_spte() return the emulate
value instead to clean up this complex interface.  Prefetch functions
can just throw away the return value.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c         | 27 ++++++++++++++-------------
 arch/x86/kvm/paging_tmpl.h | 10 +++++-----
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a9622a2..69e7d20 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2564,13 +2564,13 @@ done:
 	return ret;
 }
 
-static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			 unsigned pte_access, int write_fault, int *emulate,
-			 int level, gfn_t gfn, pfn_t pfn, bool speculative,
-			 bool host_writable)
+static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
+			 int write_fault, int level, gfn_t gfn, pfn_t pfn,
+			 bool speculative, bool host_writable)
 {
 	int was_rmapped = 0;
 	int rmap_count;
+	bool emulate = false;
 
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
@@ -2600,12 +2600,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if (set_spte(vcpu, sptep, pte_access, level, gfn, pfn, speculative,
 	      true, host_writable)) {
 		if (write_fault)
-			*emulate = 1;
+			emulate = true;
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
 	}
 
-	if (unlikely(is_mmio_spte(*sptep) && emulate))
-		*emulate = 1;
+	if (unlikely(is_mmio_spte(*sptep)))
+		emulate = true;
 
 	pgprintk("%s: setting spte %llx\n", __func__, *sptep);
 	pgprintk("instantiating %s PTE (%s) at %llx (%llx) addr %p\n",
@@ -2624,6 +2624,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	}
 
 	kvm_release_pfn_clean(pfn);
+
+	return emulate;
 }
 
 static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
@@ -2658,9 +2660,8 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 		return -1;
 
 	for (i = 0; i < ret; i++, gfn++, start++)
-		mmu_set_spte(vcpu, start, access, 0, NULL,
-			     sp->role.level, gfn, page_to_pfn(pages[i]),
-			     true, true);
+		mmu_set_spte(vcpu, start, access, 0, sp->role.level, gfn,
+			     page_to_pfn(pages[i]), true, true);
 
 	return 0;
 }
@@ -2721,9 +2722,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
 
 	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
 		if (iterator.level == level) {
-			mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
-				     write, &emulate, level, gfn, pfn,
-				     prefault, map_writable);
+			emulate = mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
+					       write, level, gfn, pfn, prefault,
+					       map_writable);
 			direct_pte_prefetch(vcpu, iterator.sptep);
 			++vcpu->stat.pf_fixed;
 			break;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index b41faa9..de24499 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -475,8 +475,8 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 * we call mmu_set_spte() with host_writable = true because
 	 * pte_prefetch_gfn_to_pfn always gets a writable pfn.
 	 */
-	mmu_set_spte(vcpu, spte, pte_access, 0, NULL, PT_PAGE_TABLE_LEVEL,
-		     gfn, pfn, true, true);
+	mmu_set_spte(vcpu, spte, pte_access, 0, PT_PAGE_TABLE_LEVEL, gfn, pfn,
+		     true, true);
 
 	return true;
 }
@@ -556,7 +556,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
 	unsigned direct_access, access = gw->pt_access;
-	int top_level, emulate = 0;
+	int top_level, emulate;
 
 	direct_access = gw->pte_access;
 
@@ -622,8 +622,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	}
 
 	clear_sp_write_flooding_count(it.sptep);
-	mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault, &emulate,
-		     it.level, gw->gfn, pfn, prefault, map_writable);
+	emulate = mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault,
+			       it.level, gw->gfn, pfn, prefault, map_writable);
 	FNAME(pte_prefetch)(vcpu, gw, it.sptep);
 
 	return emulate;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (2 preceding siblings ...)
  2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
@ 2015-11-06  7:24 ` Takuya Yoshikawa
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
  2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:24 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

is_rmap_spte(), originally named is_rmap_pte(), was introduced when the
simple reverse mapping was implemented by commit cd4a4e5374110444
("[PATCH] KVM: MMU: Implement simple reverse mapping").  At that point,
its role was clear and only rmap_add() and rmap_remove() were using it
to select sptes that need to be reverse-mapped.

Independently of that, is_shadow_present_pte() was first introduced by
commit c7addb902054195b ("KVM: Allow not-present guest page faults to
bypass kvm") to do bypass_guest_pf optimization, which does not exist
any more.

These two seem to have changed their roles somewhat, and is_rmap_spte()
just calls is_shadow_present_pte() now.

Since using both of them without no clear distinction just makes the
code confusing, remove is_rmap_spte().

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c       | 13 ++++---------
 arch/x86/kvm/mmu_audit.c |  2 +-
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69e7d20..c5e2363 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -311,11 +311,6 @@ static int is_large_pte(u64 pte)
 	return pte & PT_PAGE_SIZE_MASK;
 }
 
-static int is_rmap_spte(u64 pte)
-{
-	return is_shadow_present_pte(pte);
-}
-
 static int is_last_spte(u64 pte, int level)
 {
 	if (level == PT_PAGE_TABLE_LEVEL)
@@ -540,7 +535,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
 	u64 old_spte = *sptep;
 	bool ret = false;
 
-	WARN_ON(!is_rmap_spte(new_spte));
+	WARN_ON(!is_shadow_present_pte(new_spte));
 
 	if (!is_shadow_present_pte(old_spte)) {
 		mmu_spte_set(sptep, new_spte);
@@ -595,7 +590,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep)
 	else
 		old_spte = __update_clear_spte_slow(sptep, 0ull);
 
-	if (!is_rmap_spte(old_spte))
+	if (!is_shadow_present_pte(old_spte))
 		return 0;
 
 	pfn = spte_to_pfn(old_spte);
@@ -2575,7 +2570,7 @@ static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
 
-	if (is_rmap_spte(*sptep)) {
+	if (is_shadow_present_pte(*sptep)) {
 		/*
 		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
 		 * the parent of the now unreachable PTE.
@@ -2919,7 +2914,7 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
 	 * If the mapping has been changed, let the vcpu fault on the
 	 * same address again.
 	 */
-	if (!is_rmap_spte(spte)) {
+	if (!is_shadow_present_pte(spte)) {
 		ret = true;
 		goto exit;
 	}
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 03d518e..90ee420 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -183,7 +183,7 @@ static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
 		return;
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-		if (!is_rmap_spte(sp->spt[i]))
+		if (!is_shadow_present_pte(sp->spt[i]))
 			continue;
 
 		inspect_spte_has_rmap(kvm, sp->spt + i);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (3 preceding siblings ...)
  2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
@ 2015-11-06  7:25 ` Takuya Yoshikawa
  2015-11-09 10:14   ` Paolo Bonzini
  2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini
  5 siblings, 1 reply; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:25 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
placed right after the call to detect unrelated sptes which should not
be found in the reverse-mapping list.

Move this check in rmap_get_first/next() so that all call sites, not
just the users of the for_each_rmap_spte() macro, will be checked the
same way.  In addition, change the BUG_ON to WARN_ON since killing the
whole host is the last thing that KVM should try.

One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
rmap_get_first() to handle parent sptes.  The change will not break it
because parent sptes are present, at least until drop_parent_pte()
actually unlinks them, and not mmio-sptes.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 Documentation/virtual/kvm/mmu.txt |  4 ++--
 arch/x86/kvm/mmu.c                | 31 ++++++++++++++++++++++---------
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index 3a4d681..daf9c0f 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -203,10 +203,10 @@ Shadow pages contain the following information:
     page cannot be destroyed.  See role.invalid.
   parent_ptes:
     The reverse mapping for the pte/ptes pointing at this page's spt. If
-    parent_ptes bit 0 is zero, only one spte points at this pages and
+    parent_ptes bit 0 is zero, only one spte points at this page and
     parent_ptes points at this single spte, otherwise, there exists multiple
     sptes pointing at this page and (parent_ptes & ~0x1) points at a data
-    structure with a list of parent_ptes.
+    structure with a list of parent sptes.
   unsync:
     If true, then the translations in this page may not match the guest's
     translation.  This is equivalent to the state of the tlb when a pte is
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c5e2363..353d752 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1099,17 +1099,28 @@ struct rmap_iterator {
  */
 static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (!rmap)
 		return NULL;
 
 	if (!(rmap & 1)) {
 		iter->desc = NULL;
-		return (u64 *)rmap;
+		sptep = (u64 *)rmap;
+		goto out;
 	}
 
 	iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
 	iter->pos = 0;
-	return iter->desc->sptes[iter->pos];
+	sptep = iter->desc->sptes[iter->pos];
+out:
+	/*
+	 * Parent sptes found in sp->parent_ptes lists are also checked here
+	 * since kvm_mmu_unlink_parents() uses this function.  If the condition
+	 * needs to be changed for them, make another wrapper function.
+	 */
+	WARN_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 /*
@@ -1119,14 +1130,14 @@ static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
  */
 static u64 *rmap_get_next(struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (iter->desc) {
 		if (iter->pos < PTE_LIST_EXT - 1) {
-			u64 *sptep;
-
 			++iter->pos;
 			sptep = iter->desc->sptes[iter->pos];
 			if (sptep)
-				return sptep;
+				goto out;
 		}
 
 		iter->desc = iter->desc->more;
@@ -1134,17 +1145,20 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
 		if (iter->desc) {
 			iter->pos = 0;
 			/* desc->sptes[0] cannot be NULL */
-			return iter->desc->sptes[iter->pos];
+			sptep = iter->desc->sptes[iter->pos];
+			goto out;
 		}
 	}
 
 	return NULL;
+out:
+	WARN_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 #define for_each_rmap_spte(_rmap_, _iter_, _spte_)			    \
 	   for (_spte_ = rmap_get_first(*_rmap_, _iter_);		    \
-		_spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});  \
-			_spte_ = rmap_get_next(_iter_))
+		_spte_; _spte_ = rmap_get_next(_iter_))
 
 static void drop_spte(struct kvm *kvm, u64 *sptep)
 {
@@ -1358,7 +1372,6 @@ static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
 	bool flush = false;
 
 	while ((sptep = rmap_get_first(*rmapp, &iter))) {
-		BUG_ON(!(*sptep & PT_PRESENT_MASK));
 		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
 
 		drop_spte(kvm, sptep);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
@ 2015-11-09 10:14   ` Paolo Bonzini
  2015-11-10  9:05     ` Takuya Yoshikawa
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-09 10:14 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel

On 06/11/2015 08:25, Takuya Yoshikawa wrote:
> At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
> placed right after the call to detect unrelated sptes which should not
> be found in the reverse-mapping list.
> 
> Move this check in rmap_get_first/next() so that all call sites, not
> just the users of the for_each_rmap_spte() macro, will be checked the
> same way.  In addition, change the BUG_ON to WARN_ON since killing the
> whole host is the last thing that KVM should try.
> 
> One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
> rmap_get_first() to handle parent sptes.  The change will not break it
> because parent sptes are present, at least until drop_parent_pte()
> actually unlinks them, and not mmio-sptes.

Can you also change kvm_mmu_mark_parents_unsync to use
for_each_rmap_spte instead of pte_list_walk?  It is the last use of
pte_list_walk, and it's nice if we have two uses of for_each_rmap_spte
with parent_ptes as the argument.

BTW, on my todo list is to change the rmap items to a struct (with a
single u64 inside) for type safety.  Since you are touching this code,
perhaps you can give it a shot?

Paolo

> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>  Documentation/virtual/kvm/mmu.txt |  4 ++--
>  arch/x86/kvm/mmu.c                | 31 ++++++++++++++++++++++---------
>  2 files changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
> index 3a4d681..daf9c0f 100644
> --- a/Documentation/virtual/kvm/mmu.txt
> +++ b/Documentation/virtual/kvm/mmu.txt
> @@ -203,10 +203,10 @@ Shadow pages contain the following information:
>      page cannot be destroyed.  See role.invalid.
>    parent_ptes:
>      The reverse mapping for the pte/ptes pointing at this page's spt. If
> -    parent_ptes bit 0 is zero, only one spte points at this pages and
> +    parent_ptes bit 0 is zero, only one spte points at this page and
>      parent_ptes points at this single spte, otherwise, there exists multiple
>      sptes pointing at this page and (parent_ptes & ~0x1) points at a data
> -    structure with a list of parent_ptes.
> +    structure with a list of parent sptes.
>    unsync:
>      If true, then the translations in this page may not match the guest's
>      translation.  This is equivalent to the state of the tlb when a pte is
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index c5e2363..353d752 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1099,17 +1099,28 @@ struct rmap_iterator {
>   */
>  static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
>  {
> +	u64 *sptep;
> +
>  	if (!rmap)
>  		return NULL;
>  
>  	if (!(rmap & 1)) {
>  		iter->desc = NULL;
> -		return (u64 *)rmap;
> +		sptep = (u64 *)rmap;
> +		goto out;
>  	}
>  
>  	iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
>  	iter->pos = 0;
> -	return iter->desc->sptes[iter->pos];
> +	sptep = iter->desc->sptes[iter->pos];
> +out:
> +	/*
> +	 * Parent sptes found in sp->parent_ptes lists are also checked here
> +	 * since kvm_mmu_unlink_parents() uses this function.  If the condition
> +	 * needs to be changed for them, make another wrapper function.
> +	 */
> +	WARN_ON(!is_shadow_present_pte(*sptep));
> +	return sptep;
>  }
>  
>  /*
> @@ -1119,14 +1130,14 @@ static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
>   */
>  static u64 *rmap_get_next(struct rmap_iterator *iter)
>  {
> +	u64 *sptep;
> +
>  	if (iter->desc) {
>  		if (iter->pos < PTE_LIST_EXT - 1) {
> -			u64 *sptep;
> -
>  			++iter->pos;
>  			sptep = iter->desc->sptes[iter->pos];
>  			if (sptep)
> -				return sptep;
> +				goto out;
>  		}
>  
>  		iter->desc = iter->desc->more;
> @@ -1134,17 +1145,20 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
>  		if (iter->desc) {
>  			iter->pos = 0;
>  			/* desc->sptes[0] cannot be NULL */
> -			return iter->desc->sptes[iter->pos];
> +			sptep = iter->desc->sptes[iter->pos];
> +			goto out;
>  		}
>  	}
>  
>  	return NULL;
> +out:
> +	WARN_ON(!is_shadow_present_pte(*sptep));
> +	return sptep;
>  }
>  
>  #define for_each_rmap_spte(_rmap_, _iter_, _spte_)			    \
>  	   for (_spte_ = rmap_get_first(*_rmap_, _iter_);		    \
> -		_spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});  \
> -			_spte_ = rmap_get_next(_iter_))
> +		_spte_; _spte_ = rmap_get_next(_iter_))
>  
>  static void drop_spte(struct kvm *kvm, u64 *sptep)
>  {
> @@ -1358,7 +1372,6 @@ static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
>  	bool flush = false;
>  
>  	while ((sptep = rmap_get_first(*rmapp, &iter))) {
> -		BUG_ON(!(*sptep & PT_PRESENT_MASK));
>  		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
>  
>  		drop_spte(kvm, sptep);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (4 preceding siblings ...)
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
@ 2015-11-09 10:15 ` Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-09 10:15 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel



On 06/11/2015 08:20, Takuya Yoshikawa wrote:
> Patch 1/2/3 are easy ones.
> 
> Following two, patch 4/5, may not be ideal solutions, but at least
> explain, or try to explain, the problems.

They are okay!  I replied to patch 5 with a suggestion for further
cleanup.  I'll apply them for 4.5.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-09 10:14   ` Paolo Bonzini
@ 2015-11-10  9:05     ` Takuya Yoshikawa
  2015-11-10  9:18       ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-10  9:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel

On 2015/11/09 19:14, Paolo Bonzini wrote:
> Can you also change kvm_mmu_mark_parents_unsync to use
> for_each_rmap_spte instead of pte_list_walk?  It is the last use of
> pte_list_walk, and it's nice if we have two uses of for_each_rmap_spte
> with parent_ptes as the argument.

No problem, I will do.

Since parent_ptes is also explained as the "reverse mapping" list of
parent sptes (in mmu.txt and kvm_host.h), using rmap helpers will not
look so strange.

> BTW, on my todo list is to change the rmap items to a struct (with a
> single u64 inside) for type safety.  Since you are touching this code,
> perhaps you can give it a shot?

Yes, almost done here (assuming that you mean 'unsigned long').
But I have some candidates for its name in mind:

1. struct kvm_rmap { unsigned long val; };
2. struct kvm_rmap_head { unsigned long val; };
3. struct kvm_rmap_list_head { unsigned long val; };
4. struct kvm_spte_list_head { unsigned long val; };

Since this is the head of the reverse mapping list of sptes, I thought
name 3 might be the best and first made a patch with it, but it was
a bit longer than I had hoped it to be.

I have changed it to name 2, and it looks a bit nicer now, but even
shorter, e.g. name 1, may be good as well.

Do you have any preference?

   Takuya


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-10  9:05     ` Takuya Yoshikawa
@ 2015-11-10  9:18       ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-10  9:18 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel



On 10/11/2015 10:05, Takuya Yoshikawa wrote:
> 
> 
>> BTW, on my todo list is to change the rmap items to a struct (with a
>> single u64 inside) for type safety.  Since you are touching this code,
>> perhaps you can give it a shot?
> 
> Yes, almost done here (assuming that you mean 'unsigned long').

Yes, thanks!

> But I have some candidates for its name in mind:
> 
> 1. struct kvm_rmap { unsigned long val; };
> 2. struct kvm_rmap_head { unsigned long val; };
> 3. struct kvm_rmap_list_head { unsigned long val; };
> 4. struct kvm_spte_list_head { unsigned long val; };
> 
> Since this is the head of the reverse mapping list of sptes, I thought
> name 3 might be the best and first made a patch with it, but it was
> a bit longer than I had hoped it to be.
> 
> I have changed it to name 2, and it looks a bit nicer now, but even
> shorter, e.g. name 1, may be good as well.
> 
> Do you have any preference?

I like kvm_rmap_head.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-11-10  9:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
2015-11-09 10:14   ` Paolo Bonzini
2015-11-10  9:05     ` Takuya Yoshikawa
2015-11-10  9:18       ` Paolo Bonzini
2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.