* [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work
@ 2015-11-20  8:40 Takuya Yoshikawa
  2015-11-20  8:41 ` [PATCH 01/10] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct Takuya Yoshikawa
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:40 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

It seems like you are all busy now, so I've arranged this patch set so that
the mechanical and trivial changes come first.

V2->V3:
Patch 01: Rebased and moved here. Updated stale comments.
  We may also want to use a union inside the struct in the future, to eliminate
  the cast to (u64 *) when the spte is stored directly in the head (see the
  sketch after this list).
Patch 02-05: No change.
  About patch 03: Xiao commented on the use of braces around a single-line
  else-if statement. As I answered, checkpatch did not complain about this, and
  when the corresponding if block has multiple lines, some developers
  prefer/recommend this style. Feel free to modify it if you don't like it.
Patch 06: Changed WARN_ON to BUG_ON as Marcelo suggested.
Patch 07: Removed unnecessary zero-initialization of sp->parent_ptes as Xiao
  suggested.
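
For reference, a rough sketch of that union idea (illustrative only, not part
of this series; when bit zero of val is set, the head points to a
pte_list_desc instead, so the spte member would only be meaningful in the
single-spte case):

struct kvm_rmap_head {
	union {
		unsigned long val;
		u64 *spte;	/* single spte, when val is non-zero and bit zero is clear */
	};
};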

I think these seven patches are ready for inclusion.

Patch 08-10: No change now, though there were a few comments.
  This patch set is not intended to optimize anything, so these patches try to
  keep the places where mark_unsync() gets called unchanged as much as possible:
  the only changes are when it gets called for the new parent_pte and when
  mmu_page_add_parent_pte() gets called.

For these three, I'm not sure what we should do now; should they still be RFC?
We can also consider other approaches, e.g. moving link_shadow_page() into
kvm_mmu_get_page() as Paolo suggested before.

  Takuya

Takuya Yoshikawa (10):
  [01] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct
  [02] KVM: x86: MMU: Remove unused parameter of __direct_map()
  [03] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  [04] KVM: x86: MMU: Make mmu_set_spte() return emulate value
  [05] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  [06] KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes
  [07] KVM: x86: MMU: Move initialization of parent_ptes out from kvm_mmu_alloc_page()
  [08] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()
  [09] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page()
  [10] KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page()

 Documentation/virtual/kvm/mmu.txt |   4 +-
 arch/x86/include/asm/kvm_host.h   |   8 +-
 arch/x86/kvm/mmu.c                | 370 ++++++++++++++++++--------------------
 arch/x86/kvm/mmu_audit.c          |  15 +-
 arch/x86/kvm/paging_tmpl.h        |  20 +--
 5 files changed, 201 insertions(+), 216 deletions(-)

-- 
2.1.0



* [PATCH 01/10] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
@ 2015-11-20  8:41 ` Takuya Yoshikawa
  2015-11-20  8:42 ` [PATCH 02/10] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:41 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

New struct kvm_rmap_head makes the code type-safe to some extent.
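
As a rough illustration of the encoding this builds on (a sketch only; the
helper below is hypothetical and not added by the patch, the real users are
pte_list_add/remove() and rmap_get_first/next()):

/*
 * Bit zero of rmap_head->val selects the encoding: clear means val itself
 * is the single spte pointer (or zero if empty), set means (val & ~1)
 * points to a struct pte_list_desc holding more sptes.
 */
static struct pte_list_desc *rmap_head_to_desc(struct kvm_rmap_head *rmap_head)
{
	if (!(rmap_head->val & 1))
		return NULL;	/* empty head, or a single spte stored directly */
	return (struct pte_list_desc *)(rmap_head->val & ~1ul);
}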

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/include/asm/kvm_host.h |   8 +-
 arch/x86/kvm/mmu.c              | 196 ++++++++++++++++++++--------------------
 arch/x86/kvm/mmu_audit.c        |  13 +--
 3 files changed, 113 insertions(+), 104 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f608e17..8140077 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -214,6 +214,10 @@ union kvm_mmu_page_role {
 	};
 };
 
+struct kvm_rmap_head {
+	unsigned long val;
+};
+
 struct kvm_mmu_page {
 	struct list_head link;
 	struct hlist_node hash_link;
@@ -231,7 +235,7 @@ struct kvm_mmu_page {
 	bool unsync;
 	int root_count;          /* Currently serving as active root */
 	unsigned int unsync_children;
-	unsigned long parent_ptes;	/* Reverse mapping for parent_pte */
+	struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
 
 	/* The page is obsolete if mmu_valid_gen != kvm->arch.mmu_valid_gen.  */
 	unsigned long mmu_valid_gen;
@@ -606,7 +610,7 @@ struct kvm_lpage_info {
 };
 
 struct kvm_arch_memory_slot {
-	unsigned long *rmap[KVM_NR_PAGE_SIZES];
+	struct kvm_rmap_head *rmap[KVM_NR_PAGE_SIZES];
 	struct kvm_lpage_info *lpage_info[KVM_NR_PAGE_SIZES - 1];
 };
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 276d2f2..d9a6801 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -909,36 +909,35 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn,
 }
 
 /*
- * Pte mapping structures:
+ * About rmap_head encoding:
  *
- * If pte_list bit zero is zero, then pte_list point to the spte.
- *
- * If pte_list bit zero is one, (then pte_list & ~1) points to a struct
+ * If the bit zero of rmap_head->val is clear, then it points to the only spte
+ * in this rmap chain. Otherwise, (rmap_head->val & ~1) points to a struct
  * pte_list_desc containing more mappings.
- *
- * Returns the number of pte entries before the spte was added or zero if
- * the spte was not added.
- *
+ */
+
+/*
+ * Returns the number of pointers in the rmap chain, not counting the new one.
  */
 static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte,
-			unsigned long *pte_list)
+			struct kvm_rmap_head *rmap_head)
 {
 	struct pte_list_desc *desc;
 	int i, count = 0;
 
-	if (!*pte_list) {
+	if (!rmap_head->val) {
 		rmap_printk("pte_list_add: %p %llx 0->1\n", spte, *spte);
-		*pte_list = (unsigned long)spte;
-	} else if (!(*pte_list & 1)) {
+		rmap_head->val = (unsigned long)spte;
+	} else if (!(rmap_head->val & 1)) {
 		rmap_printk("pte_list_add: %p %llx 1->many\n", spte, *spte);
 		desc = mmu_alloc_pte_list_desc(vcpu);
-		desc->sptes[0] = (u64 *)*pte_list;
+		desc->sptes[0] = (u64 *)rmap_head->val;
 		desc->sptes[1] = spte;
-		*pte_list = (unsigned long)desc | 1;
+		rmap_head->val = (unsigned long)desc | 1;
 		++count;
 	} else {
 		rmap_printk("pte_list_add: %p %llx many->many\n", spte, *spte);
-		desc = (struct pte_list_desc *)(*pte_list & ~1ul);
+		desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
 		while (desc->sptes[PTE_LIST_EXT-1] && desc->more) {
 			desc = desc->more;
 			count += PTE_LIST_EXT;
@@ -955,8 +954,9 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte,
 }
 
 static void
-pte_list_desc_remove_entry(unsigned long *pte_list, struct pte_list_desc *desc,
-			   int i, struct pte_list_desc *prev_desc)
+pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head,
+			   struct pte_list_desc *desc, int i,
+			   struct pte_list_desc *prev_desc)
 {
 	int j;
 
@@ -967,43 +967,43 @@ pte_list_desc_remove_entry(unsigned long *pte_list, struct pte_list_desc *desc,
 	if (j != 0)
 		return;
 	if (!prev_desc && !desc->more)
-		*pte_list = (unsigned long)desc->sptes[0];
+		rmap_head->val = (unsigned long)desc->sptes[0];
 	else
 		if (prev_desc)
 			prev_desc->more = desc->more;
 		else
-			*pte_list = (unsigned long)desc->more | 1;
+			rmap_head->val = (unsigned long)desc->more | 1;
 	mmu_free_pte_list_desc(desc);
 }
 
-static void pte_list_remove(u64 *spte, unsigned long *pte_list)
+static void pte_list_remove(u64 *spte, struct kvm_rmap_head *rmap_head)
 {
 	struct pte_list_desc *desc;
 	struct pte_list_desc *prev_desc;
 	int i;
 
-	if (!*pte_list) {
+	if (!rmap_head->val) {
 		printk(KERN_ERR "pte_list_remove: %p 0->BUG\n", spte);
 		BUG();
-	} else if (!(*pte_list & 1)) {
+	} else if (!(rmap_head->val & 1)) {
 		rmap_printk("pte_list_remove:  %p 1->0\n", spte);
-		if ((u64 *)*pte_list != spte) {
+		if ((u64 *)rmap_head->val != spte) {
 			printk(KERN_ERR "pte_list_remove:  %p 1->BUG\n", spte);
 			BUG();
 		}
-		*pte_list = 0;
+		rmap_head->val = 0;
 	} else {
 		rmap_printk("pte_list_remove:  %p many->many\n", spte);
-		desc = (struct pte_list_desc *)(*pte_list & ~1ul);
+		desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
 		prev_desc = NULL;
 		while (desc) {
-			for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
+			for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i) {
 				if (desc->sptes[i] == spte) {
-					pte_list_desc_remove_entry(pte_list,
-							       desc, i,
-							       prev_desc);
+					pte_list_desc_remove_entry(rmap_head,
+							desc, i, prev_desc);
 					return;
 				}
+			}
 			prev_desc = desc;
 			desc = desc->more;
 		}
@@ -1013,18 +1013,18 @@ static void pte_list_remove(u64 *spte, unsigned long *pte_list)
 }
 
 typedef void (*pte_list_walk_fn) (u64 *spte);
-static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
+static void pte_list_walk(struct kvm_rmap_head *rmap_head, pte_list_walk_fn fn)
 {
 	struct pte_list_desc *desc;
 	int i;
 
-	if (!*pte_list)
+	if (!rmap_head->val)
 		return;
 
-	if (!(*pte_list & 1))
-		return fn((u64 *)*pte_list);
+	if (!(rmap_head->val & 1))
+		return fn((u64 *)rmap_head->val);
 
-	desc = (struct pte_list_desc *)(*pte_list & ~1ul);
+	desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
 	while (desc) {
 		for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
 			fn(desc->sptes[i]);
@@ -1032,8 +1032,8 @@ static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
 	}
 }
 
-static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
-				    struct kvm_memory_slot *slot)
+static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
+					   struct kvm_memory_slot *slot)
 {
 	unsigned long idx;
 
@@ -1041,10 +1041,8 @@ static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
 	return &slot->arch.rmap[level - PT_PAGE_TABLE_LEVEL][idx];
 }
 
-/*
- * Take gfn and return the reverse mapping to it.
- */
-static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn, struct kvm_mmu_page *sp)
+static struct kvm_rmap_head *gfn_to_rmap(struct kvm *kvm, gfn_t gfn,
+					 struct kvm_mmu_page *sp)
 {
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *slot;
@@ -1065,24 +1063,24 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
 static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
 	struct kvm_mmu_page *sp;
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 
 	sp = page_header(__pa(spte));
 	kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
-	rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp);
-	return pte_list_add(vcpu, spte, rmapp);
+	rmap_head = gfn_to_rmap(vcpu->kvm, gfn, sp);
+	return pte_list_add(vcpu, spte, rmap_head);
 }
 
 static void rmap_remove(struct kvm *kvm, u64 *spte)
 {
 	struct kvm_mmu_page *sp;
 	gfn_t gfn;
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 
 	sp = page_header(__pa(spte));
 	gfn = kvm_mmu_page_get_gfn(sp, spte - sp->spt);
-	rmapp = gfn_to_rmap(kvm, gfn, sp);
-	pte_list_remove(spte, rmapp);
+	rmap_head = gfn_to_rmap(kvm, gfn, sp);
+	pte_list_remove(spte, rmap_head);
 }
 
 /*
@@ -1102,17 +1100,18 @@ struct rmap_iterator {
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
+static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
+			   struct rmap_iterator *iter)
 {
-	if (!rmap)
+	if (!rmap_head->val)
 		return NULL;
 
-	if (!(rmap & 1)) {
+	if (!(rmap_head->val & 1)) {
 		iter->desc = NULL;
-		return (u64 *)rmap;
+		return (u64 *)rmap_head->val;
 	}
 
-	iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
+	iter->desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
 	iter->pos = 0;
 	return iter->desc->sptes[iter->pos];
 }
@@ -1146,10 +1145,10 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
 	return NULL;
 }
 
-#define for_each_rmap_spte(_rmap_, _iter_, _spte_)			    \
-	   for (_spte_ = rmap_get_first(*_rmap_, _iter_);		    \
-		_spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});  \
-			_spte_ = rmap_get_next(_iter_))
+#define for_each_rmap_spte(_rmap_head_, _iter_, _spte_)			\
+	for (_spte_ = rmap_get_first(_rmap_head_, _iter_);		\
+	     _spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});	\
+	     _spte_ = rmap_get_next(_iter_))
 
 static void drop_spte(struct kvm *kvm, u64 *sptep)
 {
@@ -1207,14 +1206,15 @@ static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool pt_protect)
 	return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
+static bool __rmap_write_protect(struct kvm *kvm,
+				 struct kvm_rmap_head *rmap_head,
 				 bool pt_protect)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
 	bool flush = false;
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep)
 		flush |= spte_write_protect(kvm, sptep, pt_protect);
 
 	return flush;
@@ -1231,13 +1231,13 @@ static bool spte_clear_dirty(struct kvm *kvm, u64 *sptep)
 	return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
 	bool flush = false;
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep)
 		flush |= spte_clear_dirty(kvm, sptep);
 
 	return flush;
@@ -1254,13 +1254,13 @@ static bool spte_set_dirty(struct kvm *kvm, u64 *sptep)
 	return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_set_dirty(struct kvm *kvm, unsigned long *rmapp)
+static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
 	bool flush = false;
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep)
 		flush |= spte_set_dirty(kvm, sptep);
 
 	return flush;
@@ -1280,12 +1280,12 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask)
 {
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 
 	while (mask) {
-		rmapp = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
-				      PT_PAGE_TABLE_LEVEL, slot);
-		__rmap_write_protect(kvm, rmapp, false);
+		rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+					  PT_PAGE_TABLE_LEVEL, slot);
+		__rmap_write_protect(kvm, rmap_head, false);
 
 		/* clear the first set bit */
 		mask &= mask - 1;
@@ -1305,12 +1305,12 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask)
 {
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 
 	while (mask) {
-		rmapp = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
-				      PT_PAGE_TABLE_LEVEL, slot);
-		__rmap_clear_dirty(kvm, rmapp);
+		rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+					  PT_PAGE_TABLE_LEVEL, slot);
+		__rmap_clear_dirty(kvm, rmap_head);
 
 		/* clear the first set bit */
 		mask &= mask - 1;
@@ -1342,27 +1342,27 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 {
 	struct kvm_memory_slot *slot;
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 	int i;
 	bool write_protected = false;
 
 	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
 	for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
-		rmapp = __gfn_to_rmap(gfn, i, slot);
-		write_protected |= __rmap_write_protect(vcpu->kvm, rmapp, true);
+		rmap_head = __gfn_to_rmap(gfn, i, slot);
+		write_protected |= __rmap_write_protect(vcpu->kvm, rmap_head, true);
 	}
 
 	return write_protected;
 }
 
-static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
+static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
 	bool flush = false;
 
-	while ((sptep = rmap_get_first(*rmapp, &iter))) {
+	while ((sptep = rmap_get_first(rmap_head, &iter))) {
 		BUG_ON(!(*sptep & PT_PRESENT_MASK));
 		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
 
@@ -1373,14 +1373,14 @@ static bool kvm_zap_rmapp(struct kvm *kvm, unsigned long *rmapp)
 	return flush;
 }
 
-static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
+static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			   struct kvm_memory_slot *slot, gfn_t gfn, int level,
 			   unsigned long data)
 {
-	return kvm_zap_rmapp(kvm, rmapp);
+	return kvm_zap_rmapp(kvm, rmap_head);
 }
 
-static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp,
+static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			     struct kvm_memory_slot *slot, gfn_t gfn, int level,
 			     unsigned long data)
 {
@@ -1395,7 +1395,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp,
 	new_pfn = pte_pfn(*ptep);
 
 restart:
-	for_each_rmap_spte(rmapp, &iter, sptep) {
+	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		rmap_printk("kvm_set_pte_rmapp: spte %p %llx gfn %llx (%d)\n",
 			     sptep, *sptep, gfn, level);
 
@@ -1433,11 +1433,11 @@ struct slot_rmap_walk_iterator {
 
 	/* output fields. */
 	gfn_t gfn;
-	unsigned long *rmap;
+	struct kvm_rmap_head *rmap;
 	int level;
 
 	/* private field. */
-	unsigned long *end_rmap;
+	struct kvm_rmap_head *end_rmap;
 };
 
 static void
@@ -1496,7 +1496,7 @@ static int kvm_handle_hva_range(struct kvm *kvm,
 				unsigned long end,
 				unsigned long data,
 				int (*handler)(struct kvm *kvm,
-					       unsigned long *rmapp,
+					       struct kvm_rmap_head *rmap_head,
 					       struct kvm_memory_slot *slot,
 					       gfn_t gfn,
 					       int level,
@@ -1540,7 +1540,8 @@ static int kvm_handle_hva_range(struct kvm *kvm,
 
 static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
 			  unsigned long data,
-			  int (*handler)(struct kvm *kvm, unsigned long *rmapp,
+			  int (*handler)(struct kvm *kvm,
+					 struct kvm_rmap_head *rmap_head,
 					 struct kvm_memory_slot *slot,
 					 gfn_t gfn, int level,
 					 unsigned long data))
@@ -1563,7 +1564,7 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
 	kvm_handle_hva(kvm, hva, (unsigned long)&pte, kvm_set_pte_rmapp);
 }
 
-static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
+static int kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			 struct kvm_memory_slot *slot, gfn_t gfn, int level,
 			 unsigned long data)
 {
@@ -1573,18 +1574,19 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
 
 	BUG_ON(!shadow_accessed_mask);
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		if (*sptep & shadow_accessed_mask) {
 			young = 1;
 			clear_bit((ffs(shadow_accessed_mask) - 1),
 				 (unsigned long *)sptep);
 		}
+	}
 
 	trace_kvm_age_page(gfn, level, slot, young);
 	return young;
 }
 
-static int kvm_test_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
+static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			      struct kvm_memory_slot *slot, gfn_t gfn,
 			      int level, unsigned long data)
 {
@@ -1600,11 +1602,12 @@ static int kvm_test_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
 	if (!shadow_accessed_mask)
 		goto out;
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		if (*sptep & shadow_accessed_mask) {
 			young = 1;
 			break;
 		}
+	}
 out:
 	return young;
 }
@@ -1613,14 +1616,14 @@ out:
 
 static void rmap_recycle(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 	struct kvm_mmu_page *sp;
 
 	sp = page_header(__pa(spte));
 
-	rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp);
+	rmap_head = gfn_to_rmap(vcpu->kvm, gfn, sp);
 
-	kvm_unmap_rmapp(vcpu->kvm, rmapp, NULL, gfn, sp->role.level, 0);
+	kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, 0);
 	kvm_flush_remote_tlbs(vcpu->kvm);
 }
 
@@ -1737,7 +1740,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	 * this feature. See the comments in kvm_zap_obsolete_pages().
 	 */
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-	sp->parent_ptes = 0;
+	sp->parent_ptes.val = 0;
 	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 	kvm_mod_used_mmu_pages(vcpu->kvm, +1);
 	return sp;
@@ -2277,7 +2280,7 @@ static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
 	u64 *sptep;
 	struct rmap_iterator iter;
 
-	while ((sptep = rmap_get_first(sp->parent_ptes, &iter)))
+	while ((sptep = rmap_get_first(&sp->parent_ptes, &iter)))
 		drop_parent_pte(sp, sptep);
 }
 
@@ -4492,7 +4495,7 @@ void kvm_mmu_setup(struct kvm_vcpu *vcpu)
 }
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm, unsigned long *rmap);
+typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head *rmap_head);
 
 /* The caller should hold mmu-lock before calling this function. */
 static bool
@@ -4586,9 +4589,10 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 	spin_unlock(&kvm->mmu_lock);
 }
 
-static bool slot_rmap_write_protect(struct kvm *kvm, unsigned long *rmapp)
+static bool slot_rmap_write_protect(struct kvm *kvm,
+				    struct kvm_rmap_head *rmap_head)
 {
-	return __rmap_write_protect(kvm, rmapp, false);
+	return __rmap_write_protect(kvm, rmap_head, false);
 }
 
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
@@ -4624,7 +4628,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 }
 
 static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
-		unsigned long *rmapp)
+					 struct kvm_rmap_head *rmap_head)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -4633,7 +4637,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 	struct kvm_mmu_page *sp;
 
 restart:
-	for_each_rmap_spte(rmapp, &iter, sptep) {
+	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		sp = page_header(__pa(sptep));
 		pfn = spte_to_pfn(*sptep);
 
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 03d518e..f7b0488 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -129,7 +129,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
 static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)
 {
 	static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 10);
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 	struct kvm_mmu_page *rev_sp;
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *slot;
@@ -150,8 +150,8 @@ static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)
 		return;
 	}
 
-	rmapp = __gfn_to_rmap(gfn, rev_sp->role.level, slot);
-	if (!*rmapp) {
+	rmap_head = __gfn_to_rmap(gfn, rev_sp->role.level, slot);
+	if (!rmap_head->val) {
 		if (!__ratelimit(&ratelimit_state))
 			return;
 		audit_printk(kvm, "no rmap for writable spte %llx\n",
@@ -192,7 +192,7 @@ static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
 
 static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	unsigned long *rmapp;
+	struct kvm_rmap_head *rmap_head;
 	u64 *sptep;
 	struct rmap_iterator iter;
 	struct kvm_memslots *slots;
@@ -203,13 +203,14 @@ static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
 
 	slots = kvm_memslots_for_spte_role(kvm, sp->role);
 	slot = __gfn_to_memslot(slots, sp->gfn);
-	rmapp = __gfn_to_rmap(sp->gfn, PT_PAGE_TABLE_LEVEL, slot);
+	rmap_head = __gfn_to_rmap(sp->gfn, PT_PAGE_TABLE_LEVEL, slot);
 
-	for_each_rmap_spte(rmapp, &iter, sptep)
+	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		if (is_writable_pte(*sptep))
 			audit_printk(kvm, "shadow page has writable "
 				     "mappings: gfn %llx role %x\n",
 				     sp->gfn, sp->role.word);
+	}
 }
 
 static void audit_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
-- 
2.1.0



* [PATCH 02/10] KVM: x86: MMU: Remove unused parameter of __direct_map()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-20  8:41 ` [PATCH 01/10] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct Takuya Yoshikawa
@ 2015-11-20  8:42 ` Takuya Yoshikawa
  2015-11-20  8:43 ` [PATCH 03/10] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:42 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d9a6801..8a1593f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,9 +2708,8 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
 	__direct_pte_prefetch(vcpu, sp, sptep);
 }
 
-static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int map_writable, int level, gfn_t gfn, pfn_t pfn,
-			bool prefault)
+static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
+			int level, gfn_t gfn, pfn_t pfn, bool prefault)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -3018,11 +3017,9 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, v, write, map_writable, level, gfn, pfn,
-			 prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
-
 	return r;
 
 out_unlock:
@@ -3531,8 +3528,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	make_mmu_pages_available(vcpu);
 	if (likely(!force_pt_level))
 		transparent_hugepage_adjust(vcpu, &gfn, &pfn, &level);
-	r = __direct_map(vcpu, gpa, write, map_writable,
-			 level, gfn, pfn, prefault);
+	r = __direct_map(vcpu, write, map_writable, level, gfn, pfn, prefault);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
-- 
2.1.0



* [PATCH 03/10] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
  2015-11-20  8:41 ` [PATCH 01/10] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct Takuya Yoshikawa
  2015-11-20  8:42 ` [PATCH 02/10] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
@ 2015-11-20  8:43 ` Takuya Yoshikawa
  2015-11-20  8:44 ` [PATCH 04/10] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:43 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

Both __mmu_unsync_walk() and mmu_pages_clear_parents() have three lines of
code which clear a bit in the unsync child bitmap; the former places them
inside a loop block and uses a few goto statements to jump to them.

A new helper function, clear_unsync_child_bit(), makes the code cleaner.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8a1593f..9832bc9 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1809,6 +1809,13 @@ static int mmu_pages_add(struct kvm_mmu_pages *pvec, struct kvm_mmu_page *sp,
 	return (pvec->nr == KVM_PAGE_ARRAY_NR);
 }
 
+static inline void clear_unsync_child_bit(struct kvm_mmu_page *sp, int idx)
+{
+	--sp->unsync_children;
+	WARN_ON((int)sp->unsync_children < 0);
+	__clear_bit(idx, sp->unsync_child_bitmap);
+}
+
 static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 			   struct kvm_mmu_pages *pvec)
 {
@@ -1818,8 +1825,10 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 		struct kvm_mmu_page *child;
 		u64 ent = sp->spt[i];
 
-		if (!is_shadow_present_pte(ent) || is_large_pte(ent))
-			goto clear_child_bitmap;
+		if (!is_shadow_present_pte(ent) || is_large_pte(ent)) {
+			clear_unsync_child_bit(sp, i);
+			continue;
+		}
 
 		child = page_header(ent & PT64_BASE_ADDR_MASK);
 
@@ -1828,28 +1837,21 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 				return -ENOSPC;
 
 			ret = __mmu_unsync_walk(child, pvec);
-			if (!ret)
-				goto clear_child_bitmap;
-			else if (ret > 0)
+			if (!ret) {
+				clear_unsync_child_bit(sp, i);
+				continue;
+			} else if (ret > 0) {
 				nr_unsync_leaf += ret;
-			else
+			} else
 				return ret;
 		} else if (child->unsync) {
 			nr_unsync_leaf++;
 			if (mmu_pages_add(pvec, child, i))
 				return -ENOSPC;
 		} else
-			 goto clear_child_bitmap;
-
-		continue;
-
-clear_child_bitmap:
-		__clear_bit(i, sp->unsync_child_bitmap);
-		sp->unsync_children--;
-		WARN_ON((int)sp->unsync_children < 0);
+			clear_unsync_child_bit(sp, i);
 	}
 
-
 	return nr_unsync_leaf;
 }
 
@@ -2012,9 +2014,7 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
 		if (!sp)
 			return;
 
-		--sp->unsync_children;
-		WARN_ON((int)sp->unsync_children < 0);
-		__clear_bit(idx, sp->unsync_child_bitmap);
+		clear_unsync_child_bit(sp, idx);
 		level++;
 	} while (level < PT64_ROOT_LEVEL-1 && !sp->unsync_children);
 }
-- 
2.1.0



* [PATCH 04/10] KVM: x86: MMU: Make mmu_set_spte() return emulate value
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (2 preceding siblings ...)
  2015-11-20  8:43 ` [PATCH 03/10] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
@ 2015-11-20  8:44 ` Takuya Yoshikawa
  2015-11-20  8:44 ` [PATCH 05/10] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:44 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

mmu_set_spte()'s code is based on the assumption that the emulate
parameter has a valid pointer value if set_spte() returns true and
write_fault is not zero.  In other cases, emulate may be NULL, so a
NULL-check is needed.

Stop passing the emulate pointer and make mmu_set_spte() return the emulate
value instead, to clean up this complex interface.  Prefetch functions
can just throw away the return value.
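
The interface change at a typical call site, taken from the diff below (shown
here only to summarize the before/after shape):

	/* before: emulate is an out-parameter that may be NULL */
	mmu_set_spte(vcpu, iterator.sptep, ACC_ALL, write, &emulate,
		     level, gfn, pfn, prefault, map_writable);

	/* after: the emulate decision is simply the return value */
	emulate = mmu_set_spte(vcpu, iterator.sptep, ACC_ALL, write,
			       level, gfn, pfn, prefault, map_writable);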

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c         | 27 ++++++++++++++-------------
 arch/x86/kvm/paging_tmpl.h | 10 +++++-----
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9832bc9..74c120c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2564,13 +2564,13 @@ done:
 	return ret;
 }
 
-static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			 unsigned pte_access, int write_fault, int *emulate,
-			 int level, gfn_t gfn, pfn_t pfn, bool speculative,
-			 bool host_writable)
+static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
+			 int write_fault, int level, gfn_t gfn, pfn_t pfn,
+			 bool speculative, bool host_writable)
 {
 	int was_rmapped = 0;
 	int rmap_count;
+	bool emulate = false;
 
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
@@ -2600,12 +2600,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if (set_spte(vcpu, sptep, pte_access, level, gfn, pfn, speculative,
 	      true, host_writable)) {
 		if (write_fault)
-			*emulate = 1;
+			emulate = true;
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
 	}
 
-	if (unlikely(is_mmio_spte(*sptep) && emulate))
-		*emulate = 1;
+	if (unlikely(is_mmio_spte(*sptep)))
+		emulate = true;
 
 	pgprintk("%s: setting spte %llx\n", __func__, *sptep);
 	pgprintk("instantiating %s PTE (%s) at %llx (%llx) addr %p\n",
@@ -2624,6 +2624,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	}
 
 	kvm_release_pfn_clean(pfn);
+
+	return emulate;
 }
 
 static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
@@ -2658,9 +2660,8 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 		return -1;
 
 	for (i = 0; i < ret; i++, gfn++, start++)
-		mmu_set_spte(vcpu, start, access, 0, NULL,
-			     sp->role.level, gfn, page_to_pfn(pages[i]),
-			     true, true);
+		mmu_set_spte(vcpu, start, access, 0, sp->role.level, gfn,
+			     page_to_pfn(pages[i]), true, true);
 
 	return 0;
 }
@@ -2721,9 +2722,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
 
 	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
 		if (iterator.level == level) {
-			mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
-				     write, &emulate, level, gfn, pfn,
-				     prefault, map_writable);
+			emulate = mmu_set_spte(vcpu, iterator.sptep, ACC_ALL,
+					       write, level, gfn, pfn, prefault,
+					       map_writable);
 			direct_pte_prefetch(vcpu, iterator.sptep);
 			++vcpu->stat.pf_fixed;
 			break;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index d8fdc5c..11650ea 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -475,8 +475,8 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 * we call mmu_set_spte() with host_writable = true because
 	 * pte_prefetch_gfn_to_pfn always gets a writable pfn.
 	 */
-	mmu_set_spte(vcpu, spte, pte_access, 0, NULL, PT_PAGE_TABLE_LEVEL,
-		     gfn, pfn, true, true);
+	mmu_set_spte(vcpu, spte, pte_access, 0, PT_PAGE_TABLE_LEVEL, gfn, pfn,
+		     true, true);
 
 	return true;
 }
@@ -556,7 +556,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
 	unsigned direct_access, access = gw->pt_access;
-	int top_level, emulate = 0;
+	int top_level, emulate;
 
 	direct_access = gw->pte_access;
 
@@ -622,8 +622,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	}
 
 	clear_sp_write_flooding_count(it.sptep);
-	mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault, &emulate,
-		     it.level, gw->gfn, pfn, prefault, map_writable);
+	emulate = mmu_set_spte(vcpu, it.sptep, gw->pte_access, write_fault,
+			       it.level, gw->gfn, pfn, prefault, map_writable);
 	FNAME(pte_prefetch)(vcpu, gw, it.sptep);
 
 	return emulate;
-- 
2.1.0



* [PATCH 05/10] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (3 preceding siblings ...)
  2015-11-20  8:44 ` [PATCH 04/10] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
@ 2015-11-20  8:44 ` Takuya Yoshikawa
  2015-11-20  8:45 ` [PATCH 06/10] KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:44 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

is_rmap_spte(), originally named is_rmap_pte(), was introduced when the
simple reverse mapping was implemented by commit cd4a4e5374110444
("[PATCH] KVM: MMU: Implement simple reverse mapping").  At that point,
its role was clear and only rmap_add() and rmap_remove() were using it
to select sptes that need to be reverse-mapped.

Independently of that, is_shadow_present_pte() was first introduced by
commit c7addb902054195b ("KVM: Allow not-present guest page faults to
bypass kvm") to implement the bypass_guest_pf optimization, which no longer
exists.

These two seem to have changed their roles somewhat, and is_rmap_spte()
just calls is_shadow_present_pte() now.

Since using both of them without clear distinction just makes the code
confusing, remove is_rmap_spte().

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c       | 13 ++++---------
 arch/x86/kvm/mmu_audit.c |  2 +-
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 74c120c..3104748 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -311,11 +311,6 @@ static int is_large_pte(u64 pte)
 	return pte & PT_PAGE_SIZE_MASK;
 }
 
-static int is_rmap_spte(u64 pte)
-{
-	return is_shadow_present_pte(pte);
-}
-
 static int is_last_spte(u64 pte, int level)
 {
 	if (level == PT_PAGE_TABLE_LEVEL)
@@ -540,7 +535,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
 	u64 old_spte = *sptep;
 	bool ret = false;
 
-	WARN_ON(!is_rmap_spte(new_spte));
+	WARN_ON(!is_shadow_present_pte(new_spte));
 
 	if (!is_shadow_present_pte(old_spte)) {
 		mmu_spte_set(sptep, new_spte);
@@ -595,7 +590,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep)
 	else
 		old_spte = __update_clear_spte_slow(sptep, 0ull);
 
-	if (!is_rmap_spte(old_spte))
+	if (!is_shadow_present_pte(old_spte))
 		return 0;
 
 	pfn = spte_to_pfn(old_spte);
@@ -2575,7 +2570,7 @@ static bool mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
 	pgprintk("%s: spte %llx write_fault %d gfn %llx\n", __func__,
 		 *sptep, write_fault, gfn);
 
-	if (is_rmap_spte(*sptep)) {
+	if (is_shadow_present_pte(*sptep)) {
 		/*
 		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
 		 * the parent of the now unreachable PTE.
@@ -2919,7 +2914,7 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
 	 * If the mapping has been changed, let the vcpu fault on the
 	 * same address again.
 	 */
-	if (!is_rmap_spte(spte)) {
+	if (!is_shadow_present_pte(spte)) {
 		ret = true;
 		goto exit;
 	}
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index f7b0488..1cee3ec 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -183,7 +183,7 @@ static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
 		return;
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-		if (!is_rmap_spte(sp->spt[i]))
+		if (!is_shadow_present_pte(sp->spt[i]))
 			continue;
 
 		inspect_spte_has_rmap(kvm, sp->spt + i);
-- 
2.1.0



* [PATCH 06/10] KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (4 preceding siblings ...)
  2015-11-20  8:44 ` [PATCH 05/10] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
@ 2015-11-20  8:45 ` Takuya Yoshikawa
  2015-11-20  8:46 ` [PATCH 07/10] KVM: x86: MMU: Move initialization of parent_ptes out from kvm_mmu_alloc_page() Takuya Yoshikawa
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:45 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

At some call sites of rmap_get_first() and rmap_get_next(), BUG_ON is
placed right after the call to detect unrelated sptes which must not be
found in the reverse-mapping list.

Move this check into rmap_get_first/next() so that all call sites, not
just the users of the for_each_rmap_spte() macro, are checked the
same way.

One thing to keep in mind is that kvm_mmu_unlink_parents() also uses
rmap_get_first() to handle parent sptes.  The change will not break it
because parent sptes are present (at least until drop_parent_pte()
actually unlinks them) and are not mmio-sptes.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 Documentation/virtual/kvm/mmu.txt |  4 ++--
 arch/x86/kvm/mmu.c                | 26 +++++++++++++++++---------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index 3a4d681..daf9c0f 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -203,10 +203,10 @@ Shadow pages contain the following information:
     page cannot be destroyed.  See role.invalid.
   parent_ptes:
     The reverse mapping for the pte/ptes pointing at this page's spt. If
-    parent_ptes bit 0 is zero, only one spte points at this pages and
+    parent_ptes bit 0 is zero, only one spte points at this page and
     parent_ptes points at this single spte, otherwise, there exists multiple
     sptes pointing at this page and (parent_ptes & ~0x1) points at a data
-    structure with a list of parent_ptes.
+    structure with a list of parent sptes.
   unsync:
     If true, then the translations in this page may not match the guest's
     translation.  This is equivalent to the state of the tlb when a pte is
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 3104748..5b249d4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1098,17 +1098,23 @@ struct rmap_iterator {
 static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
 			   struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (!rmap_head->val)
 		return NULL;
 
 	if (!(rmap_head->val & 1)) {
 		iter->desc = NULL;
-		return (u64 *)rmap_head->val;
+		sptep = (u64 *)rmap_head->val;
+		goto out;
 	}
 
 	iter->desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
 	iter->pos = 0;
-	return iter->desc->sptes[iter->pos];
+	sptep = iter->desc->sptes[iter->pos];
+out:
+	BUG_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 /*
@@ -1118,14 +1124,14 @@ static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
  */
 static u64 *rmap_get_next(struct rmap_iterator *iter)
 {
+	u64 *sptep;
+
 	if (iter->desc) {
 		if (iter->pos < PTE_LIST_EXT - 1) {
-			u64 *sptep;
-
 			++iter->pos;
 			sptep = iter->desc->sptes[iter->pos];
 			if (sptep)
-				return sptep;
+				goto out;
 		}
 
 		iter->desc = iter->desc->more;
@@ -1133,17 +1139,20 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
 		if (iter->desc) {
 			iter->pos = 0;
 			/* desc->sptes[0] cannot be NULL */
-			return iter->desc->sptes[iter->pos];
+			sptep = iter->desc->sptes[iter->pos];
+			goto out;
 		}
 	}
 
 	return NULL;
+out:
+	BUG_ON(!is_shadow_present_pte(*sptep));
+	return sptep;
 }
 
 #define for_each_rmap_spte(_rmap_head_, _iter_, _spte_)			\
 	for (_spte_ = rmap_get_first(_rmap_head_, _iter_);		\
-	     _spte_ && ({BUG_ON(!is_shadow_present_pte(*_spte_)); 1;});	\
-	     _spte_ = rmap_get_next(_iter_))
+	     _spte_; _spte_ = rmap_get_next(_iter_))
 
 static void drop_spte(struct kvm *kvm, u64 *sptep)
 {
@@ -1358,7 +1367,6 @@ static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
 	bool flush = false;
 
 	while ((sptep = rmap_get_first(rmap_head, &iter))) {
-		BUG_ON(!(*sptep & PT_PRESENT_MASK));
 		rmap_printk("%s: spte %p %llx.\n", __func__, sptep, *sptep);
 
 		drop_spte(kvm, sptep);
-- 
2.1.0



* Re: [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()
  2015-11-20  8:47 ` [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk() Takuya Yoshikawa
@ 2015-11-20  8:46   ` Xiao Guangrong
  2015-11-20  9:08     ` Takuya Yoshikawa
  2015-11-25 16:29   ` Paolo Bonzini
  1 sibling, 1 reply; 19+ messages in thread
From: Xiao Guangrong @ 2015-11-20  8:46 UTC (permalink / raw)
  To: Takuya Yoshikawa, pbonzini; +Cc: kvm, linux-kernel, mtosatti


You just ignored my comment on the previous version...

On 11/20/2015 04:47 PM, Takuya Yoshikawa wrote:
> kvm_mmu_mark_parents_unsync() alone uses pte_list_walk(), which does
> nearly the same as the for_each_rmap_spte macro.  The only difference
> is that is_shadow_present_pte() checks cannot be placed there because
> kvm_mmu_mark_parents_unsync() can be called with a new parent pointer
> whose entry is not set yet.
>
> By calling mark_unsync() separately for the parent and adding the parent
> pointer to the parent_ptes chain later in kvm_mmu_get_page(), the macro
> works with no problem.
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>   arch/x86/kvm/mmu.c | 36 +++++++++++++-----------------------
>   1 file changed, 13 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 7f46e3e..4e29d9a 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1007,26 +1007,6 @@ static void pte_list_remove(u64 *spte, struct kvm_rmap_head *rmap_head)
>   	}
>   }
>
> -typedef void (*pte_list_walk_fn) (u64 *spte);
> -static void pte_list_walk(struct kvm_rmap_head *rmap_head, pte_list_walk_fn fn)
> -{
> -	struct pte_list_desc *desc;
> -	int i;
> -
> -	if (!rmap_head->val)
> -		return;
> -
> -	if (!(rmap_head->val & 1))
> -		return fn((u64 *)rmap_head->val);
> -
> -	desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
> -	while (desc) {
> -		for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
> -			fn(desc->sptes[i]);
> -		desc = desc->more;
> -	}
> -}
> -
>   static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
>   					   struct kvm_memory_slot *slot)
>   {
> @@ -1749,7 +1729,12 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct
>   static void mark_unsync(u64 *spte);
>   static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp)
>   {
> -	pte_list_walk(&sp->parent_ptes, mark_unsync);
> +	u64 *sptep;
> +	struct rmap_iterator iter;
> +
> +	for_each_rmap_spte(&sp->parent_ptes, &iter, sptep) {
> +		mark_unsync(sptep);
> +	}
>   }
>
>   static void mark_unsync(u64 *spte)
> @@ -2119,12 +2104,17 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>   		if (sp->unsync && kvm_sync_page_transient(vcpu, sp))
>   			break;
>
> -		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
>   		if (sp->unsync_children) {
>   			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
>   			kvm_mmu_mark_parents_unsync(sp);
> -		} else if (sp->unsync)
> +			if (parent_pte)
> +				mark_unsync(parent_pte);
> +		} else if (sp->unsync) {
>   			kvm_mmu_mark_parents_unsync(sp);
> +			if (parent_pte)
> +				mark_unsync(parent_pte);
> +		}
> +		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
>
>   		__clear_sp_write_flooding_count(sp);
>   		trace_kvm_mmu_get_page(sp, false);
>


* [PATCH 07/10] KVM: x86: MMU: Move initialization of parent_ptes out from kvm_mmu_alloc_page()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (5 preceding siblings ...)
  2015-11-20  8:45 ` [PATCH 06/10] KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
@ 2015-11-20  8:46 ` Takuya Yoshikawa
  2015-11-20  8:47 ` [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk() Takuya Yoshikawa
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:46 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

Make kvm_mmu_alloc_page() do just what its name says, and remove the
extra allocation error check and the zero-initialization of parent_ptes:
shadow page headers are allocated from the per-VCPU caches, which are
filled beforehand with kmem_cache_zalloc(), so the allocation does not
fail and the header is already zeroed.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 5b249d4..7f46e3e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1726,8 +1726,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp,
 	mmu_spte_clear_no_track(parent_pte);
 }
 
-static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
-					       u64 *parent_pte, int direct)
+static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)
 {
 	struct kvm_mmu_page *sp;
 
@@ -1743,8 +1742,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	 * this feature. See the comments in kvm_zap_obsolete_pages().
 	 */
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-	sp->parent_ptes.val = 0;
-	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 	kvm_mod_used_mmu_pages(vcpu->kvm, +1);
 	return sp;
 }
@@ -2133,10 +2130,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 		trace_kvm_mmu_get_page(sp, false);
 		return sp;
 	}
+
 	++vcpu->kvm->stat.mmu_cache_miss;
-	sp = kvm_mmu_alloc_page(vcpu, parent_pte, direct);
-	if (!sp)
-		return sp;
+
+	sp = kvm_mmu_alloc_page(vcpu, direct);
+
+	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
+
 	sp->gfn = gfn;
 	sp->role = role;
 	hlist_add_head(&sp->hash_link,
-- 
2.1.0



* [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (6 preceding siblings ...)
  2015-11-20  8:46 ` [PATCH 07/10] KVM: x86: MMU: Move initialization of parent_ptes out from kvm_mmu_alloc_page() Takuya Yoshikawa
@ 2015-11-20  8:47 ` Takuya Yoshikawa
  2015-11-20  8:46   ` Xiao Guangrong
  2015-11-25 16:29   ` Paolo Bonzini
  2015-11-20  8:48 ` [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page() Takuya Yoshikawa
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:47 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

kvm_mmu_mark_parents_unsync() alone uses pte_list_walk(), which does
nearly the same as the for_each_rmap_spte macro.  The only difference
is that is_shadow_present_pte() checks cannot be placed there because
kvm_mmu_mark_parents_unsync() can be called with a new parent pointer
whose entry is not set yet.

By calling mark_unsync() separately for the parent and adding the parent
pointer to the parent_ptes chain later in kvm_mmu_get_page(), the macro
works with no problem.
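
The resulting order in kvm_mmu_get_page(), condensed from the diff below
(just a summary sketch of that hunk):

	if (sp->unsync_children) {
		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
		kvm_mmu_mark_parents_unsync(sp);	/* walks only present parents */
		if (parent_pte)
			mark_unsync(parent_pte);	/* the new, not yet set parent */
	} else if (sp->unsync) {
		kvm_mmu_mark_parents_unsync(sp);
		if (parent_pte)
			mark_unsync(parent_pte);
	}
	mmu_page_add_parent_pte(vcpu, sp, parent_pte);	/* add the new parent last */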

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7f46e3e..4e29d9a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1007,26 +1007,6 @@ static void pte_list_remove(u64 *spte, struct kvm_rmap_head *rmap_head)
 	}
 }
 
-typedef void (*pte_list_walk_fn) (u64 *spte);
-static void pte_list_walk(struct kvm_rmap_head *rmap_head, pte_list_walk_fn fn)
-{
-	struct pte_list_desc *desc;
-	int i;
-
-	if (!rmap_head->val)
-		return;
-
-	if (!(rmap_head->val & 1))
-		return fn((u64 *)rmap_head->val);
-
-	desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
-	while (desc) {
-		for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
-			fn(desc->sptes[i]);
-		desc = desc->more;
-	}
-}
-
 static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
 					   struct kvm_memory_slot *slot)
 {
@@ -1749,7 +1729,12 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct
 static void mark_unsync(u64 *spte);
 static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp)
 {
-	pte_list_walk(&sp->parent_ptes, mark_unsync);
+	u64 *sptep;
+	struct rmap_iterator iter;
+
+	for_each_rmap_spte(&sp->parent_ptes, &iter, sptep) {
+		mark_unsync(sptep);
+	}
 }
 
 static void mark_unsync(u64 *spte)
@@ -2119,12 +2104,17 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 		if (sp->unsync && kvm_sync_page_transient(vcpu, sp))
 			break;
 
-		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 		if (sp->unsync_children) {
 			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
 			kvm_mmu_mark_parents_unsync(sp);
-		} else if (sp->unsync)
+			if (parent_pte)
+				mark_unsync(parent_pte);
+		} else if (sp->unsync) {
 			kvm_mmu_mark_parents_unsync(sp);
+			if (parent_pte)
+				mark_unsync(parent_pte);
+		}
+		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 
 		__clear_sp_write_flooding_count(sp);
 		trace_kvm_mmu_get_page(sp, false);
-- 
2.1.0



* [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (7 preceding siblings ...)
  2015-11-20  8:47 ` [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk() Takuya Yoshikawa
@ 2015-11-20  8:48 ` Takuya Yoshikawa
  2015-11-20  8:57   ` Xiao Guangrong
  2015-11-20  8:48 ` [PATCH 10/10] KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page() Takuya Yoshikawa
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:48 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

Every time kvm_mmu_get_page() is called with a non-NULL parent_pte
argument, link_shadow_page() follows that to set the parent entry so
that the new mapping will point to the returned page table.

Moving the parent_pte handling there allows us to clean up the code,
because parent_pte is passed to kvm_mmu_get_page() just for mark_unsync()
and mmu_page_add_parent_pte().

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c         | 22 ++++++++--------------
 arch/x86/kvm/paging_tmpl.h |  6 ++----
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4e29d9a..b020323 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2107,14 +2107,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 		if (sp->unsync_children) {
 			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
 			kvm_mmu_mark_parents_unsync(sp);
-			if (parent_pte)
-				mark_unsync(parent_pte);
 		} else if (sp->unsync) {
 			kvm_mmu_mark_parents_unsync(sp);
-			if (parent_pte)
-				mark_unsync(parent_pte);
 		}
-		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 
 		__clear_sp_write_flooding_count(sp);
 		trace_kvm_mmu_get_page(sp, false);
@@ -2125,8 +2120,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 
 	sp = kvm_mmu_alloc_page(vcpu, direct);
 
-	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
-
 	sp->gfn = gfn;
 	sp->role = role;
 	hlist_add_head(&sp->hash_link,
@@ -2194,7 +2187,8 @@ static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
 	return __shadow_walk_next(iterator, *iterator->sptep);
 }
 
-static void link_shadow_page(u64 *sptep, struct kvm_mmu_page *sp)
+static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep,
+			     struct kvm_mmu_page *sp)
 {
 	u64 spte;
 
@@ -2205,6 +2199,11 @@ static void link_shadow_page(u64 *sptep, struct kvm_mmu_page *sp)
 	       shadow_user_mask | shadow_x_mask | shadow_accessed_mask;
 
 	mmu_spte_set(sptep, spte);
+
+	if (sp->unsync_children || sp->unsync)
+		mark_unsync(sptep);
+
+	mmu_page_add_parent_pte(vcpu, sp, sptep);
 }
 
 static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
@@ -2263,11 +2262,6 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
 		mmu_page_zap_pte(kvm, sp, sp->spt + i);
 }
 
-static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte)
-{
-	mmu_page_remove_parent_pte(sp, parent_pte);
-}
-
 static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	u64 *sptep;
@@ -2733,7 +2727,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
 					      iterator.level - 1,
 					      1, ACC_ALL, iterator.sptep);
 
-			link_shadow_page(iterator.sptep, sp);
+			link_shadow_page(vcpu, iterator.sptep, sp);
 		}
 	}
 	return emulate;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 11650ea..0dcf9c8 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -598,7 +598,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			goto out_gpte_changed;
 
 		if (sp)
-			link_shadow_page(it.sptep, sp);
+			link_shadow_page(vcpu, it.sptep, sp);
 	}
 
 	for (;
@@ -618,7 +618,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 
 		sp = kvm_mmu_get_page(vcpu, direct_gfn, addr, it.level-1,
 				      true, direct_access, it.sptep);
-		link_shadow_page(it.sptep, sp);
+		link_shadow_page(vcpu, it.sptep, sp);
 	}
 
 	clear_sp_write_flooding_count(it.sptep);
@@ -629,8 +629,6 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	return emulate;
 
 out_gpte_changed:
-	if (sp)
-		kvm_mmu_put_page(sp, it.sptep);
 	kvm_release_pfn_clean(pfn);
 	return 0;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 10/10] KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page()
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (8 preceding siblings ...)
  2015-11-20  8:48 ` [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page() Takuya Yoshikawa
@ 2015-11-20  8:48 ` Takuya Yoshikawa
  2015-11-20  9:35 ` [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Xiao Guangrong
  2015-11-20 16:01 ` Paolo Bonzini
  11 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  8:48 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c         | 20 +++++++-------------
 arch/x86/kvm/paging_tmpl.h |  4 ++--
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b020323..9baf884 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2071,8 +2071,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gva_t gaddr,
 					     unsigned level,
 					     int direct,
-					     unsigned access,
-					     u64 *parent_pte)
+					     unsigned access)
 {
 	union kvm_mmu_page_role role;
 	unsigned quadrant;
@@ -2724,8 +2723,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
 			base_addr &= PT64_LVL_ADDR_MASK(iterator.level);
 			pseudo_gfn = base_addr >> PAGE_SHIFT;
 			sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
-					      iterator.level - 1,
-					      1, ACC_ALL, iterator.sptep);
+					      iterator.level - 1, 1, ACC_ALL);
 
 			link_shadow_page(vcpu, iterator.sptep, sp);
 		}
@@ -3082,8 +3080,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
 		spin_lock(&vcpu->kvm->mmu_lock);
 		make_mmu_pages_available(vcpu);
-		sp = kvm_mmu_get_page(vcpu, 0, 0, PT64_ROOT_LEVEL,
-				      1, ACC_ALL, NULL);
+		sp = kvm_mmu_get_page(vcpu, 0, 0, PT64_ROOT_LEVEL, 1, ACC_ALL);
 		++sp->root_count;
 		spin_unlock(&vcpu->kvm->mmu_lock);
 		vcpu->arch.mmu.root_hpa = __pa(sp->spt);
@@ -3095,9 +3092,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 			spin_lock(&vcpu->kvm->mmu_lock);
 			make_mmu_pages_available(vcpu);
 			sp = kvm_mmu_get_page(vcpu, i << (30 - PAGE_SHIFT),
-					      i << 30,
-					      PT32_ROOT_LEVEL, 1, ACC_ALL,
-					      NULL);
+					i << 30, PT32_ROOT_LEVEL, 1, ACC_ALL);
 			root = __pa(sp->spt);
 			++sp->root_count;
 			spin_unlock(&vcpu->kvm->mmu_lock);
@@ -3134,7 +3129,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		spin_lock(&vcpu->kvm->mmu_lock);
 		make_mmu_pages_available(vcpu);
 		sp = kvm_mmu_get_page(vcpu, root_gfn, 0, PT64_ROOT_LEVEL,
-				      0, ACC_ALL, NULL);
+				      0, ACC_ALL);
 		root = __pa(sp->spt);
 		++sp->root_count;
 		spin_unlock(&vcpu->kvm->mmu_lock);
@@ -3167,9 +3162,8 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		}
 		spin_lock(&vcpu->kvm->mmu_lock);
 		make_mmu_pages_available(vcpu);
-		sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
-				      PT32_ROOT_LEVEL, 0,
-				      ACC_ALL, NULL);
+		sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30, PT32_ROOT_LEVEL,
+				      0, ACC_ALL);
 		root = __pa(sp->spt);
 		++sp->root_count;
 		spin_unlock(&vcpu->kvm->mmu_lock);
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 0dcf9c8..91e939b 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -587,7 +587,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		if (!is_shadow_present_pte(*it.sptep)) {
 			table_gfn = gw->table_gfn[it.level - 2];
 			sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
-					      false, access, it.sptep);
+					      false, access);
 		}
 
 		/*
@@ -617,7 +617,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		direct_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
 
 		sp = kvm_mmu_get_page(vcpu, direct_gfn, addr, it.level-1,
-				      true, direct_access, it.sptep);
+				      true, direct_access);
 		link_shadow_page(vcpu, it.sptep, sp);
 	}
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page()
  2015-11-20  8:48 ` [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page() Takuya Yoshikawa
@ 2015-11-20  8:57   ` Xiao Guangrong
  2015-11-25 16:32     ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Xiao Guangrong @ 2015-11-20  8:57 UTC (permalink / raw)
  To: Takuya Yoshikawa, pbonzini; +Cc: kvm, linux-kernel, mtosatti



You can move this patch to the front of
[PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()

By moving kvm_mmu_mark_parents_unsync() to after mmu_spte_set() (the parent
spte is present by then), you can switch to for_each_rmap_spte() directly.
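
Then kvm_mmu_mark_parents_unsync() itself becomes just the
for_each_rmap_spte() walk (essentially the hunk already in patch 08,
repeated here only as a sketch of the end result):

static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp)
{
        u64 *sptep;
        struct rmap_iterator iter;

        /* every entry in parent_ptes is already present here */
        for_each_rmap_spte(&sp->parent_ptes, &iter, sptep)
                mark_unsync(sptep);
}

and no special case is needed for a parent entry that is not present
yet, because link_shadow_page() adds sptep to parent_ptes only after
mmu_spte_set().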


On 11/20/2015 04:48 PM, Takuya Yoshikawa wrote:
> Every time kvm_mmu_get_page() is called with a non-NULL parent_pte
> argument, a link_shadow_page() call follows it to set the parent entry
> so that the new mapping will point to the returned page table.
>
> Moving the parent_pte handling there allows us to clean up the code,
> because parent_pte is passed to kvm_mmu_get_page() only for
> mark_unsync() and mmu_page_add_parent_pte().
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>   arch/x86/kvm/mmu.c         | 22 ++++++++--------------
>   arch/x86/kvm/paging_tmpl.h |  6 ++----
>   2 files changed, 10 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 4e29d9a..b020323 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2107,14 +2107,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>   		if (sp->unsync_children) {
>   			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
>   			kvm_mmu_mark_parents_unsync(sp);
> -			if (parent_pte)
> -				mark_unsync(parent_pte);
>   		} else if (sp->unsync) {
>   			kvm_mmu_mark_parents_unsync(sp);
> -			if (parent_pte)
> -				mark_unsync(parent_pte);
>   		}
> -		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
>
>   		__clear_sp_write_flooding_count(sp);
>   		trace_kvm_mmu_get_page(sp, false);
> @@ -2125,8 +2120,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>
>   	sp = kvm_mmu_alloc_page(vcpu, direct);
>
> -	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
> -
>   	sp->gfn = gfn;
>   	sp->role = role;
>   	hlist_add_head(&sp->hash_link,
> @@ -2194,7 +2187,8 @@ static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
>   	return __shadow_walk_next(iterator, *iterator->sptep);
>   }
>
> -static void link_shadow_page(u64 *sptep, struct kvm_mmu_page *sp)
> +static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep,
> +			     struct kvm_mmu_page *sp)
>   {
>   	u64 spte;
>
> @@ -2205,6 +2199,11 @@ static void link_shadow_page(u64 *sptep, struct kvm_mmu_page *sp)
>   	       shadow_user_mask | shadow_x_mask | shadow_accessed_mask;
>
>   	mmu_spte_set(sptep, spte);
> +
> +	if (sp->unsync_children || sp->unsync)
> +		mark_unsync(sptep);
> +
> +	mmu_page_add_parent_pte(vcpu, sp, sptep);
>   }
>
>   static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> @@ -2263,11 +2262,6 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
>   		mmu_page_zap_pte(kvm, sp, sp->spt + i);
>   }
>
> -static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte)
> -{
> -	mmu_page_remove_parent_pte(sp, parent_pte);
> -}
> -
>   static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
>   {
>   	u64 *sptep;
> @@ -2733,7 +2727,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, int write, int map_writable,
>   					      iterator.level - 1,
>   					      1, ACC_ALL, iterator.sptep);
>
> -			link_shadow_page(iterator.sptep, sp);
> +			link_shadow_page(vcpu, iterator.sptep, sp);
>   		}
>   	}
>   	return emulate;
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 11650ea..0dcf9c8 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -598,7 +598,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>   			goto out_gpte_changed;
>
>   		if (sp)
> -			link_shadow_page(it.sptep, sp);
> +			link_shadow_page(vcpu, it.sptep, sp);
>   	}
>
>   	for (;
> @@ -618,7 +618,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>
>   		sp = kvm_mmu_get_page(vcpu, direct_gfn, addr, it.level-1,
>   				      true, direct_access, it.sptep);
> -		link_shadow_page(it.sptep, sp);
> +		link_shadow_page(vcpu, it.sptep, sp);
>   	}
>
>   	clear_sp_write_flooding_count(it.sptep);
> @@ -629,8 +629,6 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>   	return emulate;
>
>   out_gpte_changed:
> -	if (sp)
> -		kvm_mmu_put_page(sp, it.sptep);
>   	kvm_release_pfn_clean(pfn);
>   	return 0;
>   }
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()
  2015-11-20  8:46   ` Xiao Guangrong
@ 2015-11-20  9:08     ` Takuya Yoshikawa
  0 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-20  9:08 UTC (permalink / raw)
  To: Xiao Guangrong, pbonzini; +Cc: kvm, linux-kernel, mtosatti

On 2015/11/20 17:46, Xiao Guangrong wrote:
>
> You just ignored my comment on the previous version...

I'm sorry but please read the explanation in patch 00.
I've read your comments and I'm not ignoring you.

Since this patch set has become larger than expected, I'm sending
this version so that patches 01-07 can be applied first.

For patches 08-10, I think we need to check them more, because there
seems to be some confusion between us.  You can also read the other
discussions between Marcelo, Paolo and me.

Anyway, since these three patches have been placed at the end of the
series now, I hope we can concentrate on them more easily than before.

Thanks,
   Takuya


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (9 preceding siblings ...)
  2015-11-20  8:48 ` [PATCH 10/10] KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page() Takuya Yoshikawa
@ 2015-11-20  9:35 ` Xiao Guangrong
  2015-11-20 16:01 ` Paolo Bonzini
  11 siblings, 0 replies; 19+ messages in thread
From: Xiao Guangrong @ 2015-11-20  9:35 UTC (permalink / raw)
  To: Takuya Yoshikawa, pbonzini; +Cc: kvm, linux-kernel, mtosatti



On 11/20/2015 04:40 PM, Takuya Yoshikawa wrote:
> It seems like you all are busy now, so I've made this patch set so that
> mechanical and trivial changes come before.
>
> V2->V3:
> Patch 01: Rebased and moved here. Updated stale comments.
>    We may also want to use a union, inside the struct, to eliminate casting to
>    (u64 *) type when spte is in the head in the future.
> Patch 02-05: No change.
>    About patch 03: There was a comment on the usage of braces for a single line
>    else-if statement from Xiao. As I answered, checkpatch did not complain about
>    this, and when the corresponding if block has multiple lines, some developers
>    prefer/recommend this style. Feel free to modify it if you don't like it.
> Patch 06: Changed WARN_ON to BUG_ON as Marcelo suggested.
> Patch 07: Removed unnecessary zero-initialization of sp->parent_ptes as Xiao
>    suggested.
>

For patch 01 - patch 07:

Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work
  2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
                   ` (10 preceding siblings ...)
  2015-11-20  9:35 ` [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Xiao Guangrong
@ 2015-11-20 16:01 ` Paolo Bonzini
  11 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-11-20 16:01 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao



On 20/11/2015 09:40, Takuya Yoshikawa wrote:
>   About patch 03: There was a comment on the usage of braces for a single line
>   else-if statement from Xiao. As I answered, checkpatch did not complain about
>   this, and when the corresponding if block has multiple lines, some developers
>   prefer/recommend this style. Feel free to modify it if you don't like it.

I wonder if we can remove the "else if" completely.  Will take a look
after applying, early next week.

> For these three, I'm not sure what we should do now, still RFC?
> We can also consider other approaches, e.g. moving link_shadow_page() in the
> kvm_get_mmu_page() as Paolo suggested before.

I think I will include them in kvm/next.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk()
  2015-11-20  8:47 ` [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk() Takuya Yoshikawa
  2015-11-20  8:46   ` Xiao Guangrong
@ 2015-11-25 16:29   ` Paolo Bonzini
  1 sibling, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-11-25 16:29 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: kvm, linux-kernel, mtosatti, guangrong.xiao



On 20/11/2015 09:47, Takuya Yoshikawa wrote:
> kvm_mmu_mark_parents_unsync() alone uses pte_list_walk(), which does
> nearly the same as the for_each_rmap_spte macro.  The only difference
> is that is_shadow_present_pte() checks cannot be placed there because
> kvm_mmu_mark_parents_unsync() can be called with a new parent pointer
> whose entry is not set yet.
> 
> By calling mark_unsync() separately for the parent and adding the parent
> pointer to the parent_ptes chain later in kvm_mmu_get_page(), the macro
> works with no problem.
> 
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>  arch/x86/kvm/mmu.c | 36 +++++++++++++-----------------------
>  1 file changed, 13 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 7f46e3e..4e29d9a 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1007,26 +1007,6 @@ static void pte_list_remove(u64 *spte, struct kvm_rmap_head *rmap_head)
>  	}
>  }
>  
> -typedef void (*pte_list_walk_fn) (u64 *spte);
> -static void pte_list_walk(struct kvm_rmap_head *rmap_head, pte_list_walk_fn fn)
> -{
> -	struct pte_list_desc *desc;
> -	int i;
> -
> -	if (!rmap_head->val)
> -		return;
> -
> -	if (!(rmap_head->val & 1))
> -		return fn((u64 *)rmap_head->val);
> -
> -	desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
> -	while (desc) {
> -		for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
> -			fn(desc->sptes[i]);
> -		desc = desc->more;
> -	}
> -}
> -
>  static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
>  					   struct kvm_memory_slot *slot)
>  {
> @@ -1749,7 +1729,12 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct
>  static void mark_unsync(u64 *spte);
>  static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp)
>  {
> -	pte_list_walk(&sp->parent_ptes, mark_unsync);
> +	u64 *sptep;
> +	struct rmap_iterator iter;
> +
> +	for_each_rmap_spte(&sp->parent_ptes, &iter, sptep) {
> +		mark_unsync(sptep);
> +	}
>  }
>  
>  static void mark_unsync(u64 *spte)
> @@ -2119,12 +2104,17 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>  		if (sp->unsync && kvm_sync_page_transient(vcpu, sp))
>  			break;
>  
> -		mmu_page_add_parent_pte(vcpu, sp, parent_pte);
>  		if (sp->unsync_children) {
>  			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
>  			kvm_mmu_mark_parents_unsync(sp);
> -		} else if (sp->unsync)
> +			if (parent_pte)
> +				mark_unsync(parent_pte);
> +		} else if (sp->unsync) {
>  			kvm_mmu_mark_parents_unsync(sp);
> +			if (parent_pte)
> +				mark_unsync(parent_pte);
> +		}
> +		mmu_page_add_parent_pte(vcpu, sp, parent_pte);

This patch is okay with Xiao's suggestion to remove the
kvm_mmu_mark_parents_unsync call.

Paolo

>  		__clear_sp_write_flooding_count(sp);
>  		trace_kvm_mmu_get_page(sp, false);
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page()
  2015-11-20  8:57   ` Xiao Guangrong
@ 2015-11-25 16:32     ` Paolo Bonzini
  2015-11-26  2:36       ` Takuya Yoshikawa
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2015-11-25 16:32 UTC (permalink / raw)
  To: Xiao Guangrong, Takuya Yoshikawa; +Cc: kvm, linux-kernel, mtosatti



On 20/11/2015 09:57, Xiao Guangrong wrote:
> 
> 
> You can move this patch to the front of
> [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of
> pte_list_walk()
> 
> By moving kvm_mmu_mark_parents_unsync() to after mmu_spte_set() (the
> parent spte is present by then), you can switch to for_each_rmap_spte()
> directly.

So basically squash together the two patches (8/10 and 9/10) except the
change to kvm_mmu_mark_parents_unsync; then in the second patch switch
from pte_list_walk to for_each_rmap_spte.

That makes sense indeed.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page()
  2015-11-25 16:32     ` Paolo Bonzini
@ 2015-11-26  2:36       ` Takuya Yoshikawa
  0 siblings, 0 replies; 19+ messages in thread
From: Takuya Yoshikawa @ 2015-11-26  2:36 UTC (permalink / raw)
  To: Paolo Bonzini, Xiao Guangrong; +Cc: kvm, linux-kernel, mtosatti

On 2015/11/26 1:32, Paolo Bonzini wrote:
> On 20/11/2015 09:57, Xiao Guangrong wrote:

>> You can move this patch to the front of
>> [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of
>> pte_list_walk()
>>
>> By moving kvm_mmu_mark_parents_unsync() to after mmu_spte_set() (the
>> parent spte is present by then), you can switch to for_each_rmap_spte()
>> directly.
>
> So basically squash together the two patches (8/10 and 9/10) except the
> change to kvm_mmu_mark_parents_unsync; then in the second patch switch
> from pte_list_walk to for_each_rmap_spte.
>
> That makes sense indeed.

Sorry for my being late to respond to Xiao's suggestions.  I could not
use my development machine for a while this week.

In short, this kvm_mmu_mark_parents_unsync() call in kvm_mmu_get_page()
should have been mark_unsync() for the new parent_pte only, because we
are constructing the mappings from/to it, and the other parents in
sp->parent_ptes are not related to this fault?
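
If so, then on top of patch 09 the cached-sp path in kvm_mmu_get_page()
would reduce to just something like this (only a rough sketch of what I
intend to test, not a tested change):

                /* sync request is still needed for unsync children */
                if (sp->unsync_children)
                        kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);

with link_shadow_page() doing mark_unsync() for the new parent only.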

As the code has been this way for some time, it is a bit scary to change,
but I'll do some tests without that extra kvm_mmu_mark_parents_unsync()
with a guest (with ept=0) this afternoon.

   Takuya



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-11-26  2:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-20  8:40 [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Takuya Yoshikawa
2015-11-20  8:41 ` [PATCH 01/10] KVM: x86: MMU: Encapsulate the type of rmap-chain head in a new struct Takuya Yoshikawa
2015-11-20  8:42 ` [PATCH 02/10] KVM: x86: MMU: Remove unused parameter of __direct_map() Takuya Yoshikawa
2015-11-20  8:43 ` [PATCH 03/10] KVM: x86: MMU: Add helper function to clear a bit in unsync child bitmap Takuya Yoshikawa
2015-11-20  8:44 ` [PATCH 04/10] KVM: x86: MMU: Make mmu_set_spte() return emulate value Takuya Yoshikawa
2015-11-20  8:44 ` [PATCH 05/10] KVM: x86: MMU: Remove is_rmap_spte() and use is_shadow_present_pte() Takuya Yoshikawa
2015-11-20  8:45 ` [PATCH 06/10] KVM: x86: MMU: Consolidate BUG_ON checks for reverse-mapped sptes Takuya Yoshikawa
2015-11-20  8:46 ` [PATCH 07/10] KVM: x86: MMU: Move initialization of parent_ptes out from kvm_mmu_alloc_page() Takuya Yoshikawa
2015-11-20  8:47 ` [PATCH 08/10] KVM: x86: MMU: Use for_each_rmap_spte macro instead of pte_list_walk() Takuya Yoshikawa
2015-11-20  8:46   ` Xiao Guangrong
2015-11-20  9:08     ` Takuya Yoshikawa
2015-11-25 16:29   ` Paolo Bonzini
2015-11-20  8:48 ` [PATCH 09/10] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page() Takuya Yoshikawa
2015-11-20  8:57   ` Xiao Guangrong
2015-11-25 16:32     ` Paolo Bonzini
2015-11-26  2:36       ` Takuya Yoshikawa
2015-11-20  8:48 ` [PATCH 10/10] KVM: x86: MMU: Remove unused parameter parent_pte from kvm_mmu_get_page() Takuya Yoshikawa
2015-11-20  9:35 ` [PATCH V3 00/10] KVM: x86: MMU: Clean up x86's mmu code for future work Xiao Guangrong
2015-11-20 16:01 ` Paolo Bonzini
