linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work
@ 2015-11-06  7:20 Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:20 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Patch 1/2/3 are easy ones.

Following two, patch 4/5, may not be ideal solutions, but at least
explain, or try to explain, the problems.

Takuya Yoshikawa (5):
  KVM: x86: MMU: Remove unused parameter of __direct_map()
  KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  KVM: x86: MMU: Make mmu_set_spte() return emulate value
  KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes

 Documentation/virtual/kvm/mmu.txt |   4 +-
 arch/x86/kvm/mmu.c                | 118 ++++++++++++++++++++------------------
 arch/x86/kvm/mmu_audit.c          |   2 +-
 arch/x86/kvm/paging_tmpl.h        |  10 ++--
 4 files changed, 70 insertions(+), 64 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map()
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
@ 2015-11-06  7:21 ` Takuya Yoshikawa
  2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:21 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7d85bca..a76bc04 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,9 +2708,8 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
 	__direct_pte_prefetch(vcpu, sp, sptep);
 }
 
-static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int map_writable, int level, gfn_t gfn, pfn_t pfn,
-			bool prefault)
+static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
+			int level, gfn_t gfn, pfn_t pfn, bool prefault)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -3018,8 +3017,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, v, write, map_writable, level, gfn, pfn,
-			 prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 
@@ -3541,8 +3539,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, gpa, write, map_writable,
-			 level, gfn, pfn, prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
@ 2015-11-06  7:22 ` Takuya Yoshikawa
  2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:22 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

Both __mmu_unsync_walk() and mmu_pages_clear_parents() have three line
code which clears a bit in the unsync child bitmap; the former places it
inside a loop block and uses a few goto statements to jump to it.

A new helper function, clear_unsync_child_bit(), makes the code cleaner.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a76bc04..a9622a2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1806,6 +1806,13 @@ static int mmu_pages_add(struct kvm_mmu_pages *pvec, struct kvm_mmu_page *sp,
 	return (pvec->nr == KVM_PAGE_ARRAY_NR);
 }
 
+static inline void clear_unsync_child_bit(struct kvm_mmu_page *sp, int idx)
+{
+	--sp->unsync_children;
+	WARN_ON((int)sp->unsync_children < 0);
+	__clear_bit(idx, sp->unsync_child_bitmap);
+}
+
 static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 			   struct kvm_mmu_pages *pvec)
 {
@@ -1815,8 +1822,10 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 		struct kvm_mmu_page *child;
 		u64 ent = sp->spt[i];
 
-		if (!is_shadow_present_pte(ent) || is_large_pte(ent))
-			goto clear_child_bitmap;
+		if (!is_shadow_present_pte(ent) || is_large_pte(ent)) {
+			clear_unsync_child_bit(sp, i);
+			continue;
+		}
 
 		child = page_header(ent & PT64_BASE_ADDR_MASK);
 
@@ -1825,28 +1834,21 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 				return -ENOSPC;
 
 			ret = __mmu_unsync_walk(child, pvec);
-			if (!ret)
-				goto clear_child_bitmap;
-			else if (ret > 0)
+			if (!ret) {
+				clear_unsync_child_bit(sp, i);
+				continue;
+			} else if (ret > 0) {
 				nr_unsync_leaf += ret;
-			else
+			} else
 				return ret;
 		} else if (child->unsync) {
 			nr_unsync_leaf++;
 			if (mmu_pages_add(pvec, child, i))
 				return -ENOSPC;
 		} else
-			 goto clear_child_bitmap;
-
-		continue;
-
-clear_child_bitmap:
-		__clear_bit(i, sp->unsync_child_bitmap);
-		sp->unsync_children--;
-		WARN_ON((int)sp->unsync_children < 0);
+			clear_unsync_child_bit(sp, i);
 	}
 
-
 	return nr_unsync_leaf;
 }
 
@@ -2009,9 +2011,7 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
 		if (!sp)
 			return;
 
-		--sp->unsync_children;
-		WARN_ON((int)sp->unsync_children < 0);
-		__clear_bit(idx, sp->unsync_child_bitmap);
+		clear_unsync_child_bit(sp, idx);
 		level++;
 	} while (level < PT64_ROOT_LEVEL-1 && !sp->unsync_children);
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
  2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
@ 2015-11-06  7:23 ` Takuya Yoshikawa
  2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:23 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

mmu_set_spte()'s code is based on the assumption that the emulate
parameter has a valid pointer value if set_spte() returns true and
write_fault is not zero.  In other cases, emulate may be NULL, so a
NULL-check is needed.

Stop passing emulate pointer and make mmu_set_spte() return the emulate
value instead to clean up this complex interface.  Prefetch functions
can just throw away the return value.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c         | 27 ++++++++++++++-------------
 arch/x86/kvm/paging_tmpl.h | 10 +++++-----
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a9622a2..69e7d20 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2564,13 +2564,13 @@ done:
 	return ret;
 }
 
-static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			 unsigned pte_access, int write_fault, int *emulate,
-			 int level, gfn_t gfn, pfn_t pfn, bool speculative,
-			 bool host_writable)
+static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
+			 int write_fault, int level, gfn_t gfn, pfn_t pfn,
+			 bool speculative, bool host_writable)
 {
 	int was_rmapped = 0;
 	int rmap_count;
+	bool emulate = false;
 
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
@@ -2600,12 +2600,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if (set_spte(vcpu, sptep, pte_access, level, gfn, pfn, speculative,
 	      true, host_writable)) {
 		if (write_fault)
-			*emulate = 1;
+			emulate = true;
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
 	}
 
-	if (unlikely(is_mmio_spte(*sptep) && emulate))
-		*emulate = 1;
+	if (unlikely(is_mmio_spte(*sptep)))
+		emulate = true;
 
 	pgprintk("%s: setting spte %llx\n", __func__, *sptep);
 	pgprintk("instantiating %s PTE (%s) at %llx (%llx) addr %p\n",
@@ -2624,6 +2624,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	}
 
 	kvm_release_pfn_clean(pfn);
+
+	return emulate;
 }
 
 static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
@@ -2658,9 +2660,8 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 		return -1;
 
 	for (i = 0; i < ret; i++, gfn++, start++)
-		mmu_set_spte(vcpu, start, access, 0, NULL,
-			     sp->role.level, gfn, page_to_pfn(pages[i]),
-			     true, true);
+		mmu_set_spte(vcpu, start, access, 0, sp->role.level, gfn,
+			     page_to_pfn(pages[i]), true, true);
 
 	return 0;
 }
@@ -2721,9 +2722,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
 
 	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
 		if (iterator.level == level) {
-			mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
-				     write, &emulate, level, gfn, pfn,
-				     prefault, map_writable);
+			emulate = mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
+					       write, level, gfn, pfn, prefault,
+					       map_writable);
 			direct_pte_prefetch(vcpu, iterator.sptep);
 			++vcpu->stat.pf_fixed;
 			break;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index b41faa9..de24499 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -475,8 +475,8 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 * we call mmu_set_spte() with host_writable = true because
 	 * pte_prefetch_gfn_to_pfn always gets a writable pfn.
 	 */
-	mmu_set_spte(vcpu, spte, pte_access, 0, NULL, PT_PAGE_TABLE_LEVEL,
-		     gfn, pfn, true, true);
+	mmu_set_spte(vcpu, spte, pte_access, 0, PT_PAGE_TABLE_LEVEL, gfn, pfn,
+		     true, true);
 
 	return true;
 }
@@ -556,7 +556,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
 	unsigned direct_access, access = gw->pt_access;
-	int top_level, emulate = 0;
+	int top_level, emulate;
 
 	direct_access = gw->pte_access;
 
@@ -622,8 +622,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	}
 
 	clear_sp_write_flooding_count(it.sptep);
-	mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault, &emulate,
-		     it.level, gw->gfn, pfn, prefault, map_writable);
+	emulate = mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault,
+			       it.level, gw->gfn, pfn, prefault, map_writable);
 	FNAME(pte_prefetch)(vcpu, gw, it.sptep);
 
 	return emulate;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (2 preceding siblings ...)
  2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
@ 2015-11-06  7:24 ` Takuya Yoshikawa
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
  2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:24 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

is_rmap_spte(), originally named is_rmap_pte(), was introduced when the
simple reverse mapping was implemented by commit cd4a4e5374110444
("[PATCH] KVM: MMU: Implement simple reverse mapping").  At that point,
its role was clear and only rmap_add() and rmap_remove() were using it
to select sptes that need to be reverse-mapped.

Independently of that, is_shadow_present_pte() was first introduced by
commit c7addb902054195b ("KVM: Allow not-present guest page faults to
bypass kvm") to do bypass_guest_pf optimization, which does not exist
any more.

These two seem to have changed their roles somewhat, and is_rmap_spte()
just calls is_shadow_present_pte() now.

Since using both of them without no clear distinction just makes the
code confusing, remove is_rmap_spte().

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c       | 13 ++++---------
 arch/x86/kvm/mmu_audit.c |  2 +-
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69e7d20..c5e2363 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -311,11 +311,6 @@ static int is_large_pte(u64 pte)
 	return pte & PT_PAGE_SIZE_MASK;
 }
 
-static int is_rmap_spte(u64 pte)
-{
-	return is_shadow_present_pte(pte);
-}
-
 static int is_last_spte(u64 pte, int level)
 {
 	if (level == PT_PAGE_TABLE_LEVEL)
@@ -540,7 +535,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
 	u64 old_spte = *sptep;
 	bool ret = false;
 
-	WARN_ON(!is_rmap_spte(new_spte));
+	WARN_ON(!is_shadow_present_pte(new_spte));
 
 	if (!is_shadow_present_pte(old_spte)) {
 		mmu_spte_set(sptep, new_spte);
@@ -595,7 +590,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep)
 	else
 		old_spte = __update_clear_spte_slow(sptep, 0ull);
 
-	if (!is_rmap_spte(old_spte))
+	if (!is_shadow_present_pte(old_spte))
 		return 0;
 
 	pfn = spte_to_pfn(old_spte);
@@ -2575,7 +2570,7 @@ static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
 
-	if (is_rmap_spte(*sptep)) {
+	if (is_shadow_present_pte(*sptep)) {
 		/*
 		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
 		 * the parent of the now unreachable PTE.
@@ -2919,7 +2914,7 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
 	 * If the mapping has been changed, let the vcpu fault on the
 	 * same address again.
 	 */
-	if (!is_rmap_spte(spte)) {
+	if (!is_shadow_present_pte(spte)) {
 		ret = true;
 		goto exit;
 	}
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 03d518e..90ee420 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -183,7 +183,7 @@ static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
 		return;
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-		if (!is_rmap_spte(sp->spt[i]))
+		if (!is_shadow_present_pte(sp->spt[i]))
 			continue;
 
 		inspect_spte_has_rmap(kvm, sp->spt + i);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (3 preceding siblings ...)
  2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
@ 2015-11-06  7:25 ` Takuya Yoshikawa
  2015-11-09 10:14   ` Paolo Bonzini
  2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini
  5 siblings, 1 reply; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-06  7:25 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel

At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
placed right after the call to detect unrelated sptes which should not
be found in the reverse-mapping list.

Move this check in rmap_get_first/next() so that all call sites, not
just the users of the for_each_rmap_spte() macro, will be checked the
same way.  In addition, change the BUG_ON to WARN_ON since killing the
whole host is the last thing that KVM should try.

One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
rmap_get_first() to handle parent sptes.  The change will not break it
because parent sptes are present, at least until drop_parent_pte()
actually unlinks them, and not mmio-sptes.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 Documentation/virtual/kvm/mmu.txt |  4 ++--
 arch/x86/kvm/mmu.c                | 31 ++++++++++++++++++++++---------
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index 3a4d681..daf9c0f 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -203,10 +203,10 @@ Shadow pages contain the following information:
     page cannot be destroyed.  See role.invalid.
   parent_ptes:
     The reverse mapping for the pte/ptes pointing at this page's spt. If
-    parent_ptes bit 0 is zero, only one spte points at this pages and
+    parent_ptes bit 0 is zero, only one spte points at this page and
     parent_ptes points at this single spte, otherwise, there exists multiple
     sptes pointing at this page and (parent_ptes & ~0x1) points at a data
-    structure with a list of parent_ptes.
+    structure with a list of parent sptes.
   unsync:
     If true, then the translations in this page may not match the guest's
     translation.  This is equivalent to the state of the tlb when a pte is
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c5e2363..353d752 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1099,17 +1099,28 @@ struct rmap_iterator {
  */
 static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (!rmap)
 		return NULL;
 
 	if (!(rmap & 1)) {
 		iter->desc = NULL;
-		return (u64 *)rmap;
+		sptep = (u64 *)rmap;
+		goto out;
 	}
 
 	iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
 	iter->pos = 0;
-	return iter->desc->sptes[iter->pos];
+	sptep = iter->desc->sptes[iter->pos];
+out:
+	/*
+	 * Parent sptes found in sp->parent_ptes lists are also checked here
+	 * since kvm_mmu_unlink_parents() uses this function.  If the condition
+	 * needs to be changed for them, make another wrapper function.
+	 */
+	WARN_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 /*
@@ -1119,14 +1130,14 @@ static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
  */
 static u64 *rmap_get_next(struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (iter->desc) {
 		if (iter->pos < PTE_LIST_EXT - 1) {
-			u64 *sptep;
-
 			++iter->pos;
 			sptep = iter->desc->sptes[iter->pos];
 			if (sptep)
-				return sptep;
+				goto out;
 		}
 
 		iter->desc = iter->desc->more;
@@ -1134,17 +1145,20 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
 		if (iter->desc) {
 			iter->pos = 0;
 			/* desc->sptes[0] cannot be NULL */
-			return iter->desc->sptes[iter->pos];
+			sptep = iter->desc->sptes[iter->pos];
+			goto out;
 		}
 	}
 
 	return NULL;
+out:
+	WARN_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 #define for_each_rmap_spte(_rmap_, _iter_, _spte_)			    \
 	   for (_spte_ = rmap_get_first(*_rmap_, _iter_);		    \
-		_spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});  \
-			_spte_ = rmap_get_next(_iter_))
+		_spte_; _spte_ = rmap_get_next(_iter_))
 
 static void drop_spte(struct kvm *kvm, u64 *sptep)
 {
@@ -1358,7 +1372,6 @@ static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
 	bool flush = false;
 
 	while ((sptep = rmap_get_first(*rmapp, &iter))) {
-		BUG_ON(!(*sptep & PT_PRESENT_MASK));
 		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
 
 		drop_spte(kvm, sptep);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
@ 2015-11-09 10:14   ` Paolo Bonzini
  2015-11-10  9:05     ` Takuya Yoshikawa
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-09 10:14 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel

On 06/11/2015 08:25, Takuya Yoshikawa wrote:
> At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
> placed right after the call to detect unrelated sptes which should not
> be found in the reverse-mapping list.
> 
> Move this check in rmap_get_first/next() so that all call sites, not
> just the users of the for_each_rmap_spte() macro, will be checked the
> same way.  In addition, change the BUG_ON to WARN_ON since killing the
> whole host is the last thing that KVM should try.
> 
> One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
> rmap_get_first() to handle parent sptes.  The change will not break it
> because parent sptes are present, at least until drop_parent_pte()
> actually unlinks them, and not mmio-sptes.

Can you also change kvm_mmu_mark_parents_unsync to use
for_each_rmap_spte instead of pte_list_walk?  It is the last use of
pte_list_walk, and it's nice if we have two uses of for_each_rmap_spte
with parent_ptes as the argument.

BTW, on my todo list is to change the rmap items to a struct (with a
single u64 inside) for type safety.  Since you are touching this code,
perhaps you can give it a shot?

Paolo

> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>  Documentation/virtual/kvm/mmu.txt |  4 ++--
>  arch/x86/kvm/mmu.c                | 31 ++++++++++++++++++++++---------
>  2 files changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
> index 3a4d681..daf9c0f 100644
> --- a/Documentation/virtual/kvm/mmu.txt
> +++ b/Documentation/virtual/kvm/mmu.txt
> @@ -203,10 +203,10 @@ Shadow pages contain the following information:
>      page cannot be destroyed.  See role.invalid.
>    parent_ptes:
>      The reverse mapping for the pte/ptes pointing at this page's spt. If
> -    parent_ptes bit 0 is zero, only one spte points at this pages and
> +    parent_ptes bit 0 is zero, only one spte points at this page and
>      parent_ptes points at this single spte, otherwise, there exists multiple
>      sptes pointing at this page and (parent_ptes & ~0x1) points at a data
> -    structure with a list of parent_ptes.
> +    structure with a list of parent sptes.
>    unsync:
>      If true, then the translations in this page may not match the guest's
>      translation.  This is equivalent to the state of the tlb when a pte is
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index c5e2363..353d752 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1099,17 +1099,28 @@ struct rmap_iterator {
>   */
>  static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
>  {
> +	u64 *sptep;
> +
>  	if (!rmap)
>  		return NULL;
>  
>  	if (!(rmap & 1)) {
>  		iter->desc = NULL;
> -		return (u64 *)rmap;
> +		sptep = (u64 *)rmap;
> +		goto out;
>  	}
>  
>  	iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
>  	iter->pos = 0;
> -	return iter->desc->sptes[iter->pos];
> +	sptep = iter->desc->sptes[iter->pos];
> +out:
> +	/*
> +	 * Parent sptes found in sp->parent_ptes lists are also checked here
> +	 * since kvm_mmu_unlink_parents() uses this function.  If the condition
> +	 * needs to be changed for them, make another wrapper function.
> +	 */
> +	WARN_ON(!is_shadow_present_pte(*sptep));
> +	return sptep;
>  }
>  
>  /*
> @@ -1119,14 +1130,14 @@ static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
>   */
>  static u64 *rmap_get_next(struct rmap_iterator *iter)
>  {
> +	u64 *sptep;
> +
>  	if (iter->desc) {
>  		if (iter->pos < PTE_LIST_EXT - 1) {
> -			u64 *sptep;
> -
>  			++iter->pos;
>  			sptep = iter->desc->sptes[iter->pos];
>  			if (sptep)
> -				return sptep;
> +				goto out;
>  		}
>  
>  		iter->desc = iter->desc->more;
> @@ -1134,17 +1145,20 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
>  		if (iter->desc) {
>  			iter->pos = 0;
>  			/* desc->sptes[0] cannot be NULL */
> -			return iter->desc->sptes[iter->pos];
> +			sptep = iter->desc->sptes[iter->pos];
> +			goto out;
>  		}
>  	}
>  
>  	return NULL;
> +out:
> +	WARN_ON(!is_shadow_present_pte(*sptep));
> +	return sptep;
>  }
>  
>  #define for_each_rmap_spte(_rmap_, _iter_, _spte_)			    \
>  	   for (_spte_ = rmap_get_first(*_rmap_, _iter_);		    \
> -		_spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});  \
> -			_spte_ = rmap_get_next(_iter_))
> +		_spte_; _spte_ = rmap_get_next(_iter_))
>  
>  static void drop_spte(struct kvm *kvm, u64 *sptep)
>  {
> @@ -1358,7 +1372,6 @@ static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
>  	bool flush = false;
>  
>  	while ((sptep = rmap_get_first(*rmapp, &iter))) {
> -		BUG_ON(!(*sptep & PT_PRESENT_MASK));
>  		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
>  
>  		drop_spte(kvm, sptep);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work
  2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (4 preceding siblings ...)
  2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
@ 2015-11-09 10:15 ` Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-09 10:15 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel



On 06/11/2015 08:20, Takuya Yoshikawa wrote:
> Patch 1/2/3 are easy ones.
> 
> Following two, patch 4/5, may not be ideal solutions, but at least
> explain, or try to explain, the problems.

They are okay!  I replied to patch 5 with a suggestion for further
cleanup.  I'll apply them for 4.5.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-09 10:14   ` Paolo Bonzini
@ 2015-11-10  9:05     ` Takuya Yoshikawa
  2015-11-10  9:18       ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Takuya Yoshikawa @ 2015-11-10  9:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel

On 2015/11/09 19:14, Paolo Bonzini wrote:
> Can you also change kvm_mmu_mark_parents_unsync to use
> for_each_rmap_spte instead of pte_list_walk?  It is the last use of
> pte_list_walk, and it's nice if we have two uses of for_each_rmap_spte
> with parent_ptes as the argument.

No problem, I will do.

Since parent_ptes is also explained as the "reverse mapping" list of
parent sptes (in mmu.txt and kvm_host.h), using rmap helpers will not
look so strange.

> BTW, on my todo list is to change the rmap items to a struct (with a
> single u64 inside) for type safety.  Since you are touching this code,
> perhaps you can give it a shot?

Yes, almost done here (assuming that you mean 'unsigned long').
But I have some candidates for its name in mind:

1. struct kvm_rmap { unsigned long val; };
2. struct kvm_rmap_head { unsigned long val; };
3. struct kvm_rmap_list_head { unsigned long val; };
4. struct kvm_spte_list_head { unsigned long val; };

Since this is the head of the reverse mapping list of sptes, I thought
name 3 might be the best and first made a patch with it, but it was
a bit longer than I had hoped it to be.

I have changed it to name 2, and it looks a bit nicer now, but even
shorter, e.g. name 1, may be good as well.

Do you have any preference?

   Takuya


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes
  2015-11-10  9:05     ` Takuya Yoshikawa
@ 2015-11-10  9:18       ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2015-11-10  9:18 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel



On 10/11/2015 10:05, Takuya Yoshikawa wrote:
> 
> 
>> BTW, on my todo list is to change the rmap items to a struct (with a
>> single u64 inside) for type safety.  Since you are touching this code,
>> perhaps you can give it a shot?
> 
> Yes, almost done here (assuming that you mean 'unsigned long').

Yes, thanks!

> But I have some candidates for its name in mind:
> 
> 1. struct kvm_rmap { unsigned long val; };
> 2. struct kvm_rmap_head { unsigned long val; };
> 3. struct kvm_rmap_list_head { unsigned long val; };
> 4. struct kvm_spte_list_head { unsigned long val; };
> 
> Since this is the head of the reverse mapping list of sptes, I thought
> name 3 might be the best and first made a patch with it, but it was
> a bit longer than I had hoped it to be.
> 
> I have changed it to name 2, and it looks a bit nicer now, but even
> shorter, e.g. name 1, may be good as well.
> 
> Do you have any preference?

I like kvm_rmap_head.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-11-10  9:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06  7:20 [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
2015-11-06  7:21 ` [PATCH 1/5] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
2015-11-06  7:22 ` [PATCH 2/5] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
2015-11-06  7:23 ` [PATCH 3/5] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
2015-11-06  7:24 ` [PATCH 4/5] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
2015-11-06  7:25 ` [PATCH 5/5] KVM: x86: MMU: Consolidate WARN_ON/BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
2015-11-09 10:14   ` Paolo Bonzini
2015-11-10  9:05     ` Takuya Yoshikawa
2015-11-10  9:18       ` Paolo Bonzini
2015-11-09 10:15 ` [PATCH 0/5] KVM: x86: MMU: Clean up x86's mmu code for future work Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).