kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Patch v3 0/7] Optimize clear dirty log
@ 2023-02-11  1:46 Vipin Sharma
  2023-02-11  1:46 ` [Patch v3 1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Vipin Sharma
                   ` (7 more replies)
  0 siblings, 8 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Hi,


This patch series has optimized control flow of clearing dirty log and
improved its performance by ~40% (2% more than v2).

It also got rid of many variants of the handle_changed_spte family of
functions and converged logic to one handle_changed_spte() function. It
also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
booleans for controlling them.

Thanks,
Vipin

v3:
- Tried to do better job at writing commit messages.
- Made kvm_tdp_mmu_clear_spte_bits() similar to the kvm_tdp_mmu_write_spte().
- clear_dirty_pt_masked() evaluates mask for the bit to be cleared outside the
  loop and use that for all of the SPTEs instead of calculating for each SPTE.
- Some naming changes based on the feedbacks.
- Split out the dead code clean from the optimization code.


v2: https://lore.kernel.org/lkml/20230203192822.106773-1-vipinsh@google.com/
- Clear dirty log and age gfn range does not go through
  handle_changed_spte, they handle their SPTE changes locally to improve
  their speed.
- Clear only specific bits atomically when updating SPTEs in clearing
  dirty log and aging gfn range functions.
- Removed tdp_mmu_set_spte_no_[acc_track|dirty_log] APIs.
- Converged all handle_changed_spte related functions to one place.

v1: https://lore.kernel.org/lkml/20230125213857.824959-1-vipinsh@google.com/


Vipin Sharma (7):
  KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic
    write
  KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log
    flow
  KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Optimize SPTE change for aging gfn range
  KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  KVM: x86/mmu: Merge all handle_changed_pte* functions.

 arch/x86/kvm/mmu/tdp_iter.h |  48 ++++++---
 arch/x86/kvm/mmu/tdp_mmu.c  | 190 ++++++++++++------------------------
 2 files changed, 96 insertions(+), 142 deletions(-)

-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch v3 1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-11  1:46 ` [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Vipin Sharma
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Move conditions in kvm_tdp_mmu_write_spte() to check if an SPTE should
be written atomically or not to a separate function.

This new function, kvm_tdp_mmu_spte_need_atomic_write(),  will be used
in future commits to optimize clearing bits in SPTEs.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/tdp_iter.h | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index f0af385c56e0..c11c5d00b2c1 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -29,23 +29,29 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 	WRITE_ONCE(*rcu_dereference(sptep), new_spte);
 }
 
+/*
+ * SPTEs must be modified atomically if they are shadow-present, leaf
+ * SPTEs, and have volatile bits, i.e. has bits that can be set outside
+ * of mmu_lock.  The Writable bit can be set by KVM's fast page fault
+ * handler, and Accessed and Dirty bits can be set by the CPU.
+ *
+ * Note, non-leaf SPTEs do have Accessed bits and those bits are
+ * technically volatile, but KVM doesn't consume the Accessed bit of
+ * non-leaf SPTEs, i.e. KVM doesn't care if it clobbers the bit.  This
+ * logic needs to be reassessed if KVM were to use non-leaf Accessed
+ * bits, e.g. to skip stepping down into child SPTEs when aging SPTEs.
+ */
+static inline bool kvm_tdp_mmu_spte_need_atomic_write(u64 old_spte, int level)
+{
+	return is_shadow_present_pte(old_spte) &&
+	       is_last_spte(old_spte, level) &&
+	       spte_has_volatile_bits(old_spte);
+}
+
 static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 					 u64 new_spte, int level)
 {
-	/*
-	 * Atomically write the SPTE if it is a shadow-present, leaf SPTE with
-	 * volatile bits, i.e. has bits that can be set outside of mmu_lock.
-	 * The Writable bit can be set by KVM's fast page fault handler, and
-	 * Accessed and Dirty bits can be set by the CPU.
-	 *
-	 * Note, non-leaf SPTEs do have Accessed bits and those bits are
-	 * technically volatile, but KVM doesn't consume the Accessed bit of
-	 * non-leaf SPTEs, i.e. KVM doesn't care if it clobbers the bit.  This
-	 * logic needs to be reassessed if KVM were to use non-leaf Accessed
-	 * bits, e.g. to skip stepping down into child SPTEs when aging SPTEs.
-	 */
-	if (is_shadow_present_pte(old_spte) && is_last_spte(old_spte, level) &&
-	    spte_has_volatile_bits(old_spte))
+	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
 		return kvm_tdp_mmu_write_spte_atomic(sptep, new_spte);
 
 	__kvm_tdp_mmu_write_spte(sptep, new_spte);
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
  2023-02-11  1:46 ` [Patch v3 1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-15 21:12   ` David Matlack
  2023-03-17 22:59   ` Sean Christopherson
  2023-02-11  1:46 ` [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Vipin Sharma
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Do atomic-AND to clear the dirty state of SPTEs. Optimize clear-dirty-log
flow by avoiding to go through __handle_changed_spte() and directly call
kvm_set_pfn_dirty() instead.

Atomic-AND allows to fetch the latest value in SPTE, clear only its
dirty state and set the new SPTE value.  This optimization avoids
executing unnecessary checks by not calling __handle_changed_spte().

With the removal of tdp_mmu_set_spte_no_dirty_log(), "record_dirty_log"
parameter in __tdp_mmu_set_spte() is now obsolete. It will always be set
to true by its caller. This dead code will be cleaned up in future
commits.

Tested on a VM (160 vCPUs, 160 GB memory) and found that performance of clear
dirty log stage improved by ~40% in dirty_log_perf_test

Before optimization:
--------------------
Iteration 1 clear dirty log time: 3.638543593s
Iteration 2 clear dirty log time: 3.145032742s
Iteration 3 clear dirty log time: 3.142340358s
Clear dirty log over 3 iterations took 9.925916693s. (Avg 3.308638897s/iteration)

After optimization:
-------------------
Iteration 1 clear dirty log time: 2.318988110s
Iteration 2 clear dirty log time: 1.794470164s
Iteration 3 clear dirty log time: 1.791668628s
Clear dirty log over 3 iterations took 5.905126902s. (Avg 1.968375634s/iteration)

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/tdp_iter.h | 14 ++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c  | 35 +++++++++++++++--------------------
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index c11c5d00b2c1..fae559559a80 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -58,6 +58,20 @@ static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 	return old_spte;
 }
 
+static inline u64 tdp_mmu_clear_spte_bits(tdp_ptep_t sptep, u64 old_spte,
+					  u64 mask, int level)
+{
+	atomic64_t *sptep_atomic;
+
+	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level)) {
+		sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
+		return (u64)atomic64_fetch_and(~mask, sptep_atomic);
+	}
+
+	__kvm_tdp_mmu_write_spte(sptep, old_spte & ~mask);
+	return old_spte;
+}
+
 /*
  * A TDP iterator performs a pre-order walk over a TDP paging structure.
  */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bba33aea0fb0..66ccbeb9d845 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -771,13 +771,6 @@ static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
 	_tdp_mmu_set_spte(kvm, iter, new_spte, false, true);
 }
 
-static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
-						 struct tdp_iter *iter,
-						 u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true, false);
-}
-
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
 	for_each_tdp_pte(_iter, _root, _start, _end)
 
@@ -1677,8 +1670,13 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
 static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 				  gfn_t gfn, unsigned long mask, bool wrprot)
 {
+	/*
+	 * Either all SPTEs in TDP MMU will need write protection or none. This
+	 * contract will not be modified for TDP MMU pages.
+	 */
+	u64 clear_bit = (wrprot || !kvm_ad_enabled()) ? PT_WRITABLE_MASK :
+							shadow_dirty_mask;
 	struct tdp_iter iter;
-	u64 new_spte;
 
 	rcu_read_lock();
 
@@ -1693,19 +1691,16 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		mask &= ~(1UL << (iter.gfn - gfn));
 
-		if (wrprot || spte_ad_need_write_protect(iter.old_spte)) {
-			if (is_writable_pte(iter.old_spte))
-				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
-			else
-				continue;
-		} else {
-			if (iter.old_spte & shadow_dirty_mask)
-				new_spte = iter.old_spte & ~shadow_dirty_mask;
-			else
-				continue;
-		}
+		if (!(iter.old_spte & clear_bit))
+			continue;
 
-		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
+		iter.old_spte = tdp_mmu_clear_spte_bits(iter.sptep,
+							iter.old_spte,
+							clear_bit, iter.level);
+		trace_kvm_tdp_mmu_spte_changed(iter.as_id, iter.gfn, iter.level,
+					       iter.old_spte,
+					       iter.old_spte & ~clear_bit);
+		kvm_set_pfn_dirty(spte_to_pfn(iter.old_spte));
 	}
 
 	rcu_read_unlock();
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
  2023-02-11  1:46 ` [Patch v3 1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Vipin Sharma
  2023-02-11  1:46 ` [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-15 21:10   ` David Matlack
  2023-02-11  1:46 ` [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range Vipin Sharma
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Remove bool parameter "record_dirty_log" from __tdp_mmu_set_spte() and
refactor the code as this variable is always set to true by its caller.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 66ccbeb9d845..c895560244de 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -710,18 +710,13 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
  *		      notifier for access tracking. Leaving record_acc_track
  *		      unset in that case prevents page accesses from being
  *		      double counted.
- * @record_dirty_log: Record the page as dirty in the dirty bitmap if
- *		      appropriate for the change being made. Should be set
- *		      unless performing certain dirty logging operations.
- *		      Leaving record_dirty_log unset in that case prevents page
- *		      writes from being double counted.
  *
  * Returns the old SPTE value, which _may_ be different than @old_spte if the
  * SPTE had voldatile bits.
  */
 static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 			      u64 old_spte, u64 new_spte, gfn_t gfn, int level,
-			      bool record_acc_track, bool record_dirty_log)
+			      bool record_acc_track)
 {
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
@@ -740,35 +735,34 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	if (record_acc_track)
 		handle_changed_spte_acc_track(old_spte, new_spte, level);
-	if (record_dirty_log)
-		handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
-					      new_spte, level);
+
+	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
+				      level);
 	return old_spte;
 }
 
 static inline void _tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				     u64 new_spte, bool record_acc_track,
-				     bool record_dirty_log)
+				     u64 new_spte, bool record_acc_track)
 {
 	WARN_ON_ONCE(iter->yielded);
 
 	iter->old_spte = __tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
 					    iter->old_spte, new_spte,
 					    iter->gfn, iter->level,
-					    record_acc_track, record_dirty_log);
+					    record_acc_track);
 }
 
 static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 				    u64 new_spte)
 {
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true, true);
+	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
 }
 
 static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
 						 struct tdp_iter *iter,
 						 u64 new_spte)
 {
-	_tdp_mmu_set_spte(kvm, iter, new_spte, false, true);
+	_tdp_mmu_set_spte(kvm, iter, new_spte, false);
 }
 
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
@@ -918,7 +912,7 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 		return false;
 
 	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
-			   sp->gfn, sp->role.level + 1, true, true);
+			   sp->gfn, sp->role.level + 1, true);
 
 	return true;
 }
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
                   ` (2 preceding siblings ...)
  2023-02-11  1:46 ` [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-15 21:15   ` David Matlack
  2023-03-21  0:51   ` Sean Christopherson
  2023-02-11  1:46 ` [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Vipin Sharma
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

No need to check all of the conditions in __handle_changed_spte(). Aging
a gfn range implies resetting access bit or marking spte for access
tracking.

Use atomic operation to only reset those bits. This avoids checking many
conditions in __handle_changed_spte() API. Also, clean up code by
removing dead code and API parameters.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c895560244de..5d6e77554797 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -758,13 +758,6 @@ static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
 }
 
-static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
-						 struct tdp_iter *iter,
-						 u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, false);
-}
-
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
 	for_each_tdp_pte(_iter, _root, _start, _end)
 
@@ -1251,32 +1244,41 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
 /*
  * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
  * if any of the GFNs in the range have been accessed.
+ *
+ * No need to mark corresponding PFN as accessed as this call is coming from
+ * the clear_young() or clear_flush_young() notifier, which uses the return
+ * value to determine if the page has been accessed.
  */
 static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 			  struct kvm_gfn_range *range)
 {
-	u64 new_spte = 0;
+	u64 new_spte;
 
 	/* If we have a non-accessed entry we don't need to change the pte. */
 	if (!is_accessed_spte(iter->old_spte))
 		return false;
 
-	new_spte = iter->old_spte;
-
-	if (spte_ad_enabled(new_spte)) {
-		new_spte &= ~shadow_accessed_mask;
+	if (spte_ad_enabled(iter->old_spte)) {
+		iter->old_spte  = tdp_mmu_clear_spte_bits(iter->sptep,
+							  iter->old_spte,
+							  shadow_accessed_mask,
+							  iter->level);
+		new_spte = iter->old_spte & ~shadow_accessed_mask;
 	} else {
+		new_spte = mark_spte_for_access_track(iter->old_spte);
+		iter->old_spte = kvm_tdp_mmu_write_spte(iter->sptep,
+							iter->old_spte, new_spte,
+							iter->level);
 		/*
 		 * Capture the dirty status of the page, so that it doesn't get
 		 * lost when the SPTE is marked for access tracking.
 		 */
-		if (is_writable_pte(new_spte))
-			kvm_set_pfn_dirty(spte_to_pfn(new_spte));
-
-		new_spte = mark_spte_for_access_track(new_spte);
+		if (is_writable_pte(iter->old_spte))
+			kvm_set_pfn_dirty(spte_to_pfn(iter->old_spte));
 	}
 
-	tdp_mmu_set_spte_no_acc_track(kvm, iter, new_spte);
+	trace_kvm_tdp_mmu_spte_changed(iter->as_id, iter->gfn, iter->level,
+				       iter->old_spte, new_spte);
 
 	return true;
 }
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
                   ` (3 preceding siblings ...)
  2023-02-11  1:46 ` [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-22 19:31   ` David Matlack
  2023-02-11  1:46 ` [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Vipin Sharma
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Remove bool parameter "record_acc_track" from __tdp_mmu_set_spte() and
refactor the code. This variable is always set to true by its caller.

Remove single and double underscore prefix from tdp_mmu_set_spte()
related APIs:
1. Change __tdp_mmu_set_spte() to tdp_mmu_set_spte()
2. Change _tdp_mmu_set_spte() to tdp_mmu_iter_set_spte()

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 51 +++++++++++++-------------------------
 1 file changed, 17 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 5d6e77554797..e50e869bf879 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -697,7 +697,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 
 
 /*
- * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
+ * tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
  * @kvm:	      KVM instance
  * @as_id:	      Address space ID, i.e. regular vs. SMM
  * @sptep:	      Pointer to the SPTE
@@ -705,18 +705,12 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
  * @new_spte:	      The new value that will be set for the SPTE
  * @gfn:	      The base GFN that was (or will be) mapped by the SPTE
  * @level:	      The level _containing_ the SPTE (its parent PT's level)
- * @record_acc_track: Notify the MM subsystem of changes to the accessed state
- *		      of the page. Should be set unless handling an MMU
- *		      notifier for access tracking. Leaving record_acc_track
- *		      unset in that case prevents page accesses from being
- *		      double counted.
  *
  * Returns the old SPTE value, which _may_ be different than @old_spte if the
  * SPTE had voldatile bits.
  */
-static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
-			      u64 old_spte, u64 new_spte, gfn_t gfn, int level,
-			      bool record_acc_track)
+static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
+			    u64 old_spte, u64 new_spte, gfn_t gfn, int level)
 {
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
@@ -732,30 +726,19 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 	old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
 
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
-
-	if (record_acc_track)
-		handle_changed_spte_acc_track(old_spte, new_spte, level);
-
+	handle_changed_spte_acc_track(old_spte, new_spte, level);
 	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
 				      level);
 	return old_spte;
 }
 
-static inline void _tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				     u64 new_spte, bool record_acc_track)
+static inline void tdp_mmu_iter_set_spte(struct kvm *kvm, struct tdp_iter *iter,
+					 u64 new_spte)
 {
 	WARN_ON_ONCE(iter->yielded);
-
-	iter->old_spte = __tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
-					    iter->old_spte, new_spte,
-					    iter->gfn, iter->level,
-					    record_acc_track);
-}
-
-static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				    u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
+	iter->old_spte = tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
+					  iter->old_spte, new_spte,
+					  iter->gfn, iter->level);
 }
 
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
@@ -847,7 +830,7 @@ static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			continue;
 
 		if (!shared)
-			tdp_mmu_set_spte(kvm, &iter, 0);
+			tdp_mmu_iter_set_spte(kvm, &iter, 0);
 		else if (tdp_mmu_set_spte_atomic(kvm, &iter, 0))
 			goto retry;
 	}
@@ -904,8 +887,8 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 	if (WARN_ON_ONCE(!is_shadow_present_pte(old_spte)))
 		return false;
 
-	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
-			   sp->gfn, sp->role.level + 1, true);
+	tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
+			 sp->gfn, sp->role.level + 1);
 
 	return true;
 }
@@ -939,7 +922,7 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
 
-		tdp_mmu_set_spte(kvm, &iter, 0);
+		tdp_mmu_iter_set_spte(kvm, &iter, 0);
 		flush = true;
 	}
 
@@ -1110,7 +1093,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 		if (ret)
 			return ret;
 	} else {
-		tdp_mmu_set_spte(kvm, iter, spte);
+		tdp_mmu_iter_set_spte(kvm, iter, spte);
 	}
 
 	tdp_account_mmu_page(kvm, sp);
@@ -1317,13 +1300,13 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
 	 * invariant that the PFN of a present * leaf SPTE can never change.
 	 * See __handle_changed_spte().
 	 */
-	tdp_mmu_set_spte(kvm, iter, 0);
+	tdp_mmu_iter_set_spte(kvm, iter, 0);
 
 	if (!pte_write(range->pte)) {
 		new_spte = kvm_mmu_changed_pte_notifier_make_spte(iter->old_spte,
 								  pte_pfn(range->pte));
 
-		tdp_mmu_set_spte(kvm, iter, new_spte);
+		tdp_mmu_iter_set_spte(kvm, iter, new_spte);
 	}
 
 	return true;
@@ -1814,7 +1797,7 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
 		if (new_spte == iter.old_spte)
 			break;
 
-		tdp_mmu_set_spte(kvm, &iter, new_spte);
+		tdp_mmu_iter_set_spte(kvm, &iter, new_spte);
 		spte_set = true;
 	}
 
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
                   ` (4 preceding siblings ...)
  2023-02-11  1:46 ` [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-22 19:36   ` David Matlack
  2023-02-11  1:46 ` [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions Vipin Sharma
  2023-03-17 22:57 ` [Patch v3 0/7] Optimize clear dirty log Sean Christopherson
  7 siblings, 1 reply; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

Remove handle_changed_spte_dirty_log() as there is no code flow which
sets 4KiB SPTE writable and hit this path. This function marks the page
dirty in a memslot only if new SPTE is 4KiB in size and writable.

Current users of handle_changed_spte_dirty_log() are:
1. set_spte_gfn() - Create only non writable SPTEs.
2. write_protect_gfn() - Change an SPTE to non writable.
3. zap leaf and roots APIs - Everything is 0.
4. handle_removed_pt() - Sets SPTEs to REMOVED_SPTE
5. tdp_mmu_link_sp() - Makes non leaf SPTEs.

There is also no path which creates a writable 4KiB without going
through make_spte() and this functions takes care of marking SPTE dirty
in the memslot if it is PT_WRITABLE.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 22 ----------------------
 1 file changed, 22 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e50e869bf879..0c031319e901 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -345,24 +345,6 @@ static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
 		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
 }
 
-static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
-					  u64 old_spte, u64 new_spte, int level)
-{
-	bool pfn_changed;
-	struct kvm_memory_slot *slot;
-
-	if (level > PG_LEVEL_4K)
-		return;
-
-	pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
-
-	if ((!is_writable_pte(old_spte) || pfn_changed) &&
-	    is_writable_pte(new_spte)) {
-		slot = __gfn_to_memslot(__kvm_memslots(kvm, as_id), gfn);
-		mark_page_dirty_in_slot(kvm, slot, gfn);
-	}
-}
-
 static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, +1);
@@ -614,8 +596,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
 			      shared);
 	handle_changed_spte_acc_track(old_spte, new_spte, level);
-	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
-				      new_spte, level);
 }
 
 /*
@@ -727,8 +707,6 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
 	handle_changed_spte_acc_track(old_spte, new_spte, level);
-	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
-				      level);
 	return old_spte;
 }
 
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions.
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
                   ` (5 preceding siblings ...)
  2023-02-11  1:46 ` [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Vipin Sharma
@ 2023-02-11  1:46 ` Vipin Sharma
  2023-02-22 19:42   ` David Matlack
  2023-03-17 22:51   ` Sean Christopherson
  2023-03-17 22:57 ` [Patch v3 0/7] Optimize clear dirty log Sean Christopherson
  7 siblings, 2 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-02-11  1:46 UTC (permalink / raw)
  To: seanjc, pbonzini, bgardon, dmatlack; +Cc: kvm, linux-kernel, Vipin Sharma

__handle_changed_pte() and handle_changed_spte_acc_track() are always
used together. Merge these two functions and name the new function
handle_changed_pte(). Remove the existing handle_changed_pte() function
which just calls __handle_changed_pte and
handle_changed_spte_acc_track().

This converges SPTEs change handling code to a single place.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 42 +++++++++++---------------------------
 1 file changed, 12 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 0c031319e901..67538ec48ce5 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -334,17 +334,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				u64 old_spte, u64 new_spte, int level,
 				bool shared);
 
-static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
-{
-	if (!is_shadow_present_pte(old_spte) || !is_last_spte(old_spte, level))
-		return;
-
-	if (is_accessed_spte(old_spte) &&
-	    (!is_shadow_present_pte(new_spte) || !is_accessed_spte(new_spte) ||
-	     spte_to_pfn(old_spte) != spte_to_pfn(new_spte)))
-		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
-}
-
 static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, +1);
@@ -487,7 +476,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
 }
 
 /**
- * __handle_changed_spte - handle bookkeeping associated with an SPTE change
+ * handle_changed_spte - handle bookkeeping associated with an SPTE change
  * @kvm: kvm instance
  * @as_id: the address space of the paging structure the SPTE was a part of
  * @gfn: the base GFN that was mapped by the SPTE
@@ -501,9 +490,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
  * Handle bookkeeping that might result from the modification of a SPTE.
  * This function must be called for all TDP SPTE modifications.
  */
-static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
-				  u64 old_spte, u64 new_spte, int level,
-				  bool shared)
+static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
+				u64 old_spte, u64 new_spte, int level,
+				bool shared)
 {
 	bool was_present = is_shadow_present_pte(old_spte);
 	bool is_present = is_shadow_present_pte(new_spte);
@@ -587,15 +576,10 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	if (was_present && !was_leaf &&
 	    (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
 		handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared);
-}
 
-static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
-				u64 old_spte, u64 new_spte, int level,
-				bool shared)
-{
-	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
-			      shared);
-	handle_changed_spte_acc_track(old_spte, new_spte, level);
+	if (was_leaf && is_accessed_spte(old_spte) &&
+	    (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
+		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
 }
 
 /*
@@ -638,9 +622,8 @@ static inline int tdp_mmu_set_spte_atomic(struct kvm *kvm,
 	if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte))
 		return -EBUSY;
 
-	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
-			      new_spte, iter->level, true);
-	handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level);
+	handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
+			    new_spte, iter->level, true);
 
 	return 0;
 }
@@ -705,8 +688,7 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
 
-	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
-	handle_changed_spte_acc_track(old_spte, new_spte, level);
+	handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
 	return old_spte;
 }
 
@@ -1276,7 +1258,7 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
 	 * Note, when changing a read-only SPTE, it's not strictly necessary to
 	 * zero the SPTE before setting the new PFN, but doing so preserves the
 	 * invariant that the PFN of a present * leaf SPTE can never change.
-	 * See __handle_changed_spte().
+	 * See handle_changed_spte().
 	 */
 	tdp_mmu_iter_set_spte(kvm, iter, 0);
 
@@ -1301,7 +1283,7 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	/*
 	 * No need to handle the remote TLB flush under RCU protection, the
 	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
-	 * shadow page.  See the WARN on pfn_changed in __handle_changed_spte().
+	 * shadow page. See the WARN on pfn_changed in handle_changed_spte().
 	 */
 	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn);
 }
-- 
2.39.1.581.gbfd45094c4-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  2023-02-11  1:46 ` [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Vipin Sharma
@ 2023-02-15 21:10   ` David Matlack
  0 siblings, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-15 21:10 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:22PM -0800, Vipin Sharma wrote:
> Remove bool parameter "record_dirty_log" from __tdp_mmu_set_spte() and
> refactor the code as this variable is always set to true by its caller.
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> ---

Reviewed-by: David Matlack <dmatlack@google.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
  2023-02-11  1:46 ` [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Vipin Sharma
@ 2023-02-15 21:12   ` David Matlack
  2023-03-17 22:59   ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-15 21:12 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:21PM -0800, Vipin Sharma wrote:
> Do atomic-AND to clear the dirty state of SPTEs. Optimize clear-dirty-log
> flow by avoiding to go through __handle_changed_spte() and directly call
> kvm_set_pfn_dirty() instead.
> 
> Atomic-AND allows to fetch the latest value in SPTE, clear only its
> dirty state and set the new SPTE value.  This optimization avoids
> executing unnecessary checks by not calling __handle_changed_spte().
> 
> With the removal of tdp_mmu_set_spte_no_dirty_log(), "record_dirty_log"
> parameter in __tdp_mmu_set_spte() is now obsolete. It will always be set
> to true by its caller. This dead code will be cleaned up in future
> commits.
> 
> Tested on a VM (160 vCPUs, 160 GB memory) and found that performance of clear
> dirty log stage improved by ~40% in dirty_log_perf_test
> 
> Before optimization:
> --------------------
> Iteration 1 clear dirty log time: 3.638543593s
> Iteration 2 clear dirty log time: 3.145032742s
> Iteration 3 clear dirty log time: 3.142340358s
> Clear dirty log over 3 iterations took 9.925916693s. (Avg 3.308638897s/iteration)
> 
> After optimization:
> -------------------
> Iteration 1 clear dirty log time: 2.318988110s
> Iteration 2 clear dirty log time: 1.794470164s
> Iteration 3 clear dirty log time: 1.791668628s
> Clear dirty log over 3 iterations took 5.905126902s. (Avg 1.968375634s/iteration)
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range
  2023-02-11  1:46 ` [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range Vipin Sharma
@ 2023-02-15 21:15   ` David Matlack
  2023-03-21  0:51   ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-15 21:15 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:23PM -0800, Vipin Sharma wrote:
> No need to check all of the conditions in __handle_changed_spte(). Aging
> a gfn range implies resetting access bit or marking spte for access
> tracking.

nit: State what the patch does first.

> 
> Use atomic operation to only reset those bits. This avoids checking many
> conditions in __handle_changed_spte() API. Also, clean up code by
> removing dead code and API parameters.
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>

nits aside,

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 36 +++++++++++++++++++-----------------
>  1 file changed, 19 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index c895560244de..5d6e77554797 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -758,13 +758,6 @@ static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
>  	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
>  }
>  
> -static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
> -						 struct tdp_iter *iter,
> -						 u64 new_spte)
> -{
> -	_tdp_mmu_set_spte(kvm, iter, new_spte, false);
> -}
> -
>  #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
>  	for_each_tdp_pte(_iter, _root, _start, _end)
>  
> @@ -1251,32 +1244,41 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
>  /*
>   * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
>   * if any of the GFNs in the range have been accessed.
> + *
> + * No need to mark corresponding PFN as accessed as this call is coming from
> + * the clear_young() or clear_flush_young() notifier, which uses the return
> + * value to determine if the page has been accessed.
>   */
>  static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
>  			  struct kvm_gfn_range *range)
>  {
> -	u64 new_spte = 0;
> +	u64 new_spte;
>  
>  	/* If we have a non-accessed entry we don't need to change the pte. */
>  	if (!is_accessed_spte(iter->old_spte))
>  		return false;
>  
> -	new_spte = iter->old_spte;
> -
> -	if (spte_ad_enabled(new_spte)) {
> -		new_spte &= ~shadow_accessed_mask;
> +	if (spte_ad_enabled(iter->old_spte)) {
> +		iter->old_spte  = tdp_mmu_clear_spte_bits(iter->sptep,

nit: Extra space before =

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  2023-02-11  1:46 ` [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Vipin Sharma
@ 2023-02-22 19:31   ` David Matlack
  0 siblings, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-22 19:31 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:24PM -0800, Vipin Sharma wrote:
> Remove bool parameter "record_acc_track" from __tdp_mmu_set_spte() and
> refactor the code. This variable is always set to true by its caller.
> 
> Remove single and double underscore prefix from tdp_mmu_set_spte()

uber-nit: I find it helpful to use phrasing like "Opportunistically do
X" for opportunistic cleanups that are separate from the primary change.
Otherwise the commit message reads as if 2 totally independent changes
are being made.

> related APIs:
> 1. Change __tdp_mmu_set_spte() to tdp_mmu_set_spte()
> 2. Change _tdp_mmu_set_spte() to tdp_mmu_iter_set_spte()
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> ---

Reviewed-by: David Matlack <dmatlack@google.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  2023-02-11  1:46 ` [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Vipin Sharma
@ 2023-02-22 19:36   ` David Matlack
  0 siblings, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-22 19:36 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:25PM -0800, Vipin Sharma wrote:
> Remove handle_changed_spte_dirty_log() as there is no code flow which
> sets 4KiB SPTE writable and hit this path. This function marks the page
> dirty in a memslot only if new SPTE is 4KiB in size and writable.
> 
> Current users of handle_changed_spte_dirty_log() are:
> 1. set_spte_gfn() - Create only non writable SPTEs.
> 2. write_protect_gfn() - Change an SPTE to non writable.
> 3. zap leaf and roots APIs - Everything is 0.
> 4. handle_removed_pt() - Sets SPTEs to REMOVED_SPTE
> 5. tdp_mmu_link_sp() - Makes non leaf SPTEs.
> 
> There is also no path which creates a writable 4KiB without going
> through make_spte() and this functions takes care of marking SPTE dirty
> in the memslot if it is PT_WRITABLE.
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>

Aside from the comment suggestion,

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 22 ----------------------
>  1 file changed, 22 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index e50e869bf879..0c031319e901 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -345,24 +345,6 @@ static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
>  		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
>  }
>  
> -static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
> -					  u64 old_spte, u64 new_spte, int level)
> -{
> -	bool pfn_changed;
> -	struct kvm_memory_slot *slot;
> -
> -	if (level > PG_LEVEL_4K)
> -		return;
> -
> -	pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
> -
> -	if ((!is_writable_pte(old_spte) || pfn_changed) &&
> -	    is_writable_pte(new_spte)) {
> -		slot = __gfn_to_memslot(__kvm_memslots(kvm, as_id), gfn);
> -		mark_page_dirty_in_slot(kvm, slot, gfn);
> -	}
> -}
> -
>  static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
>  	kvm_account_pgtable_pages((void *)sp->spt, +1);
> @@ -614,8 +596,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>  	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
>  			      shared);
>  	handle_changed_spte_acc_track(old_spte, new_spte, level);
> -	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
> -				      new_spte, level);

Please leave a comment somewhere in handle_changed_spte() to document
why mark_page_dirty_in_slot() is not needed to help future readers and
in case something changes in the future.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions.
  2023-02-11  1:46 ` [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions Vipin Sharma
@ 2023-02-22 19:42   ` David Matlack
  2023-03-17 22:51   ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: David Matlack @ 2023-02-22 19:42 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: seanjc, pbonzini, bgardon, kvm, linux-kernel

On Fri, Feb 10, 2023 at 05:46:26PM -0800, Vipin Sharma wrote:
> __handle_changed_pte() and handle_changed_spte_acc_track() are always
> used together. Merge these two functions and name the new function

nit: State what the patch does first.

> handle_changed_pte(). Remove the existing handle_changed_pte() function
> which just calls __handle_changed_pte and
> handle_changed_spte_acc_track().

s/_pte/_spte/

> 
> This converges SPTEs change handling code to a single place.
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> Reviewed-by: Ben Gardon <bgardon@google.com>

Nice cleanup! Commit message nits aside,

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 42 +++++++++++---------------------------
>  1 file changed, 12 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 0c031319e901..67538ec48ce5 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -334,17 +334,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>  				u64 old_spte, u64 new_spte, int level,
>  				bool shared);
>  
> -static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
> -{
> -	if (!is_shadow_present_pte(old_spte) || !is_last_spte(old_spte, level))
> -		return;
> -
> -	if (is_accessed_spte(old_spte) &&
> -	    (!is_shadow_present_pte(new_spte) || !is_accessed_spte(new_spte) ||
> -	     spte_to_pfn(old_spte) != spte_to_pfn(new_spte)))
> -		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
> -}
> -
>  static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
>  	kvm_account_pgtable_pages((void *)sp->spt, +1);
> @@ -487,7 +476,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>  }
>  
>  /**
> - * __handle_changed_spte - handle bookkeeping associated with an SPTE change
> + * handle_changed_spte - handle bookkeeping associated with an SPTE change
>   * @kvm: kvm instance
>   * @as_id: the address space of the paging structure the SPTE was a part of
>   * @gfn: the base GFN that was mapped by the SPTE
> @@ -501,9 +490,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>   * Handle bookkeeping that might result from the modification of a SPTE.
>   * This function must be called for all TDP SPTE modifications.
>   */
> -static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> -				  u64 old_spte, u64 new_spte, int level,
> -				  bool shared)
> +static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> +				u64 old_spte, u64 new_spte, int level,
> +				bool shared)
>  {
>  	bool was_present = is_shadow_present_pte(old_spte);
>  	bool is_present = is_shadow_present_pte(new_spte);
> @@ -587,15 +576,10 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>  	if (was_present && !was_leaf &&
>  	    (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
>  		handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared);
> -}
>  
> -static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> -				u64 old_spte, u64 new_spte, int level,
> -				bool shared)
> -{
> -	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
> -			      shared);
> -	handle_changed_spte_acc_track(old_spte, new_spte, level);
> +	if (was_leaf && is_accessed_spte(old_spte) &&
> +	    (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
> +		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
>  }
>  
>  /*
> @@ -638,9 +622,8 @@ static inline int tdp_mmu_set_spte_atomic(struct kvm *kvm,
>  	if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte))
>  		return -EBUSY;
>  
> -	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> -			      new_spte, iter->level, true);
> -	handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level);
> +	handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> +			    new_spte, iter->level, true);
>  
>  	return 0;
>  }
> @@ -705,8 +688,7 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
>  
>  	old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
>  
> -	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
> -	handle_changed_spte_acc_track(old_spte, new_spte, level);
> +	handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
>  	return old_spte;
>  }
>  
> @@ -1276,7 +1258,7 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
>  	 * Note, when changing a read-only SPTE, it's not strictly necessary to
>  	 * zero the SPTE before setting the new PFN, but doing so preserves the
>  	 * invariant that the PFN of a present * leaf SPTE can never change.
> -	 * See __handle_changed_spte().
> +	 * See handle_changed_spte().
>  	 */
>  	tdp_mmu_iter_set_spte(kvm, iter, 0);
>  
> @@ -1301,7 +1283,7 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  	/*
>  	 * No need to handle the remote TLB flush under RCU protection, the
>  	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
> -	 * shadow page.  See the WARN on pfn_changed in __handle_changed_spte().
> +	 * shadow page. See the WARN on pfn_changed in handle_changed_spte().
>  	 */
>  	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn);
>  }
> -- 
> 2.39.1.581.gbfd45094c4-goog
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions.
  2023-02-11  1:46 ` [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions Vipin Sharma
  2023-02-22 19:42   ` David Matlack
@ 2023-03-17 22:51   ` Sean Christopherson
  2023-03-17 23:48     ` Vipin Sharma
  1 sibling, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-03-17 22:51 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Feb 10, 2023, Vipin Sharma wrote:
> @@ -1301,7 +1283,7 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  	/*
>  	 * No need to handle the remote TLB flush under RCU protection, the
>  	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
> -	 * shadow page.  See the WARN on pfn_changed in __handle_changed_spte().
> +	 * shadow page. See the WARN on pfn_changed in handle_changed_spte().

I was just starting to think you're an ok person, and then I find out you're a
heretic that only puts a single space after periods.  ;-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
                   ` (6 preceding siblings ...)
  2023-02-11  1:46 ` [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions Vipin Sharma
@ 2023-03-17 22:57 ` Sean Christopherson
  2023-03-17 23:51   ` Vipin Sharma
  2023-03-21  0:41   ` Sean Christopherson
  7 siblings, 2 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-03-17 22:57 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Feb 10, 2023, Vipin Sharma wrote:
> This patch series has optimized control flow of clearing dirty log and
> improved its performance by ~40% (2% more than v2).
> 
> It also got rid of many variants of the handle_changed_spte family of
> functions and converged logic to one handle_changed_spte() function. It
> also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> booleans for controlling them.
>
> v3:
> - Tried to do better job at writing commit messages.

LOL, that's the spirit!

Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
various nits when applying.  I'll also add a link in patch 2 to the discussion
about why we determined that bypassing __tdp_mmu_set_spte() is safe; that's critical
information that isn't captured in the changelog.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
  2023-02-11  1:46 ` [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Vipin Sharma
  2023-02-15 21:12   ` David Matlack
@ 2023-03-17 22:59   ` Sean Christopherson
  2023-03-17 23:50     ` Vipin Sharma
  1 sibling, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-03-17 22:59 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Feb 10, 2023, Vipin Sharma wrote:
> @@ -1677,8 +1670,13 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
>  static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
>  				  gfn_t gfn, unsigned long mask, bool wrprot)
>  {
> +	/*
> +	 * Either all SPTEs in TDP MMU will need write protection or none. This
> +	 * contract will not be modified for TDP MMU pages.
> +	 */
> +	u64 clear_bit = (wrprot || !kvm_ad_enabled()) ? PT_WRITABLE_MASK :
> +							shadow_dirty_mask;

Switching from spte_ad_need_write_protect() to kvm_ad_enabled() belongs in a
separate.  In the unlikely event that the above assertion/contracts is invalid,
then any issues should bisect to the switch, not to a much more complex patch.

I'll make that happen when applying.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions.
  2023-03-17 22:51   ` Sean Christopherson
@ 2023-03-17 23:48     ` Vipin Sharma
  0 siblings, 0 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-03-17 23:48 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Mar 17, 2023 at 3:51 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > @@ -1301,7 +1283,7 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >       /*
> >        * No need to handle the remote TLB flush under RCU protection, the
> >        * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
> > -      * shadow page.  See the WARN on pfn_changed in __handle_changed_spte().
> > +      * shadow page. See the WARN on pfn_changed in handle_changed_spte().
>
> I was just starting to think you're an ok person, and then I find out you're a
> heretic that only puts a single space after periods.  ;-)

I know, I know, I can't help it. I just love the way single space
looks after periods.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
  2023-03-17 22:59   ` Sean Christopherson
@ 2023-03-17 23:50     ` Vipin Sharma
  0 siblings, 0 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-03-17 23:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Mar 17, 2023 at 3:59 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > @@ -1677,8 +1670,13 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
> >  static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
> >                                 gfn_t gfn, unsigned long mask, bool wrprot)
> >  {
> > +     /*
> > +      * Either all SPTEs in TDP MMU will need write protection or none. This
> > +      * contract will not be modified for TDP MMU pages.
> > +      */
> > +     u64 clear_bit = (wrprot || !kvm_ad_enabled()) ? PT_WRITABLE_MASK :
> > +                                                     shadow_dirty_mask;
>
> Switching from spte_ad_need_write_protect() to kvm_ad_enabled() belongs in a
> separate.  In the unlikely event that the above assertion/contracts is invalid,
> then any issues should bisect to the switch, not to a much more complex patch.
>
> I'll make that happen when applying.

Make sense, thanks!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-17 22:57 ` [Patch v3 0/7] Optimize clear dirty log Sean Christopherson
@ 2023-03-17 23:51   ` Vipin Sharma
  2023-03-21  0:41   ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: Vipin Sharma @ 2023-03-17 23:51 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Mar 17, 2023 at 3:57 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > This patch series has optimized control flow of clearing dirty log and
> > improved its performance by ~40% (2% more than v2).
> >
> > It also got rid of many variants of the handle_changed_spte family of
> > functions and converged logic to one handle_changed_spte() function. It
> > also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> > booleans for controlling them.
> >
> > v3:
> > - Tried to do better job at writing commit messages.
>
> LOL, that's the spirit!
>
> Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> various nits when applying.  I'll also add a link in patch 2 to the discussion

Yeah, he is too demanding! :p

> about why we determined that bypassing __tdp_mmu_set_spte() is safe; that's critical
> information that isn't captured in the changelog.

Thanks!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-17 22:57 ` [Patch v3 0/7] Optimize clear dirty log Sean Christopherson
  2023-03-17 23:51   ` Vipin Sharma
@ 2023-03-21  0:41   ` Sean Christopherson
  2023-03-21 18:11     ` Vipin Sharma
  1 sibling, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2023-03-21  0:41 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Mar 17, 2023, Sean Christopherson wrote:
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > This patch series has optimized control flow of clearing dirty log and
> > improved its performance by ~40% (2% more than v2).
> > 
> > It also got rid of many variants of the handle_changed_spte family of
> > functions and converged logic to one handle_changed_spte() function. It
> > also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> > booleans for controlling them.
> >
> > v3:
> > - Tried to do better job at writing commit messages.
> 
> LOL, that's the spirit!
> 
> Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> various nits when applying.

Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
but damage done, and it's my fault for throwing you a big blob of code in the
first place.

I ended up splitting the "interesting" patches into three each:

  1. Switch to the atomic-AND
  2. Drop the access-tracking / dirty-logging (as appropriate)
  3. Drop the call to __handle_changed_spte()

because logically they are three different things (although obviously related).

I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
can take a look in the meantime to make sure I didn't do something completely
boneheaded, it'd be much appreciated.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range
  2023-02-11  1:46 ` [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range Vipin Sharma
  2023-02-15 21:15   ` David Matlack
@ 2023-03-21  0:51   ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-03-21  0:51 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Fri, Feb 10, 2023, Vipin Sharma wrote:
>  	} else {
> +		new_spte = mark_spte_for_access_track(iter->old_spte);
> +		iter->old_spte = kvm_tdp_mmu_write_spte(iter->sptep,
> +							iter->old_spte, new_spte,
> +							iter->level);
>  		/*
>  		 * Capture the dirty status of the page, so that it doesn't get
>  		 * lost when the SPTE is marked for access tracking.
>  		 */
> -		if (is_writable_pte(new_spte))
> -			kvm_set_pfn_dirty(spte_to_pfn(new_spte));
> -
> -		new_spte = mark_spte_for_access_track(new_spte);
> +		if (is_writable_pte(iter->old_spte))
> +			kvm_set_pfn_dirty(spte_to_pfn(iter->old_spte));

Moving this block below kvm_tdp_mmu_write_spte() is an unrelated change.  Much to
my chagrin, I discovered that past me gave you this code.  I still think the change
is correct, but I dropped it for now, mostly because the legacy/shadow MMU has the
same pattern (marks the PFN dirty before setting the SPTE).

I think this might actually be a bug fix, e.g. if the XCHG races with a fast page
fault fix and drops the Writable bit, the CPU could insert writable entry into the
TLB without KVM invoking kvm_set_pfn_dirty().  But I'm not 100% confident that I'm
not missing something, and _if_ there's a bug then mmu_spte_age() needs the same
fix, so for now, I dropped it.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-21  0:41   ` Sean Christopherson
@ 2023-03-21 18:11     ` Vipin Sharma
  2023-03-21 19:57       ` Sean Christopherson
  0 siblings, 1 reply; 26+ messages in thread
From: Vipin Sharma @ 2023-03-21 18:11 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > various nits when applying.
>
> Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> but damage done, and it's my fault for throwing you a big blob of code in the
> first place.
>
> I ended up splitting the "interesting" patches into three each:
>
>   1. Switch to the atomic-AND
>   2. Drop the access-tracking / dirty-logging (as appropriate)
>   3. Drop the call to __handle_changed_spte()
>
> because logically they are three different things (although obviously related).
>
> I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> can take a look in the meantime to make sure I didn't do something completely
> boneheaded, it'd be much appreciated.


Thanks for refactoring the patches. I reviewed the commits, no obvious
red flags from my side. Few small nits I found:

commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
if TDP MMU SPTEs need wrprot")
 - kvm_ad_enabled() should be outside the loop.

commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
in the clear-dirty-log flow")
 - MMU_WARN_ON(kvm_ad_enabled() &&
spte_ad_need_write_protect(iter.old_spte) should be after
if(iter.level > PG_LEVEL_4k...)

commit 93c375bb6aea ("KVM: x86/mmu: Bypass __handle_changed_spte()
when clearing TDP MMU dirty bits")
 - Needs new performance numbers. Adding MMU_WARN_ON() might change
numbers. I will run a perf test on your mmu branch and see if
something changes a lot.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-21 18:11     ` Vipin Sharma
@ 2023-03-21 19:57       ` Sean Christopherson
  2023-03-21 20:40         ` Sean Christopherson
  2023-03-21 21:38         ` Sean Christopherson
  0 siblings, 2 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-03-21 19:57 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Tue, Mar 21, 2023, Vipin Sharma wrote:
> On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > > various nits when applying.
> >
> > Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> > but damage done, and it's my fault for throwing you a big blob of code in the
> > first place.
> >
> > I ended up splitting the "interesting" patches into three each:
> >
> >   1. Switch to the atomic-AND
> >   2. Drop the access-tracking / dirty-logging (as appropriate)
> >   3. Drop the call to __handle_changed_spte()
> >
> > because logically they are three different things (although obviously related).
> >
> > I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> > sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> > can take a look in the meantime to make sure I didn't do something completely
> > boneheaded, it'd be much appreciated.
> 
> 
> Thanks for refactoring the patches. I reviewed the commits, no obvious
> red flags from my side. Few small nits I found:
> 
> commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
> if TDP MMU SPTEs need wrprot")
>  - kvm_ad_enabled() should be outside the loop.

Hmm, I deliberately left it inside the loop, but I agree that it would be better
to hoist it out in that commit.

> commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
> in the clear-dirty-log flow")
>  - MMU_WARN_ON(kvm_ad_enabled() &&
> spte_ad_need_write_protect(iter.old_spte) should be after
> if(iter.level > PG_LEVEL_4k...)

Ah, hrm.  This was also deliberate, but looking at the diff I agree that relative
to the diff, it's an unnecessary/unrelated change.  I think what I'll do is
land the assertion above the "if (iter.level > PG_LEVEL_4K ||" in the above
commit that switches to kvm_ad_enabled().  That way there shouldn't be any change
for the assertion in this commit.

> commit 93c375bb6aea ("KVM: x86/mmu: Bypass __handle_changed_spte()
> when clearing TDP MMU dirty bits")
>  - Needs new performance numbers. Adding MMU_WARN_ON() might change
> numbers. I will run a perf test on your mmu branch and see if
> something changes a lot.

It won't.  MMU_WARN_ON() is dead code without manual modification to define MMU_DEBUG.
Part of the reason I used MMU_WARN_ON() was to remind myself to send a patch/series
to overhaul MMU_WARN_ON[*].  My thought/hope is that a Kconfig will allow developers
and testers to run with a pile of assertions and sanity checks without impacting
the runtime overhead for production builds.

[*] https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-21 19:57       ` Sean Christopherson
@ 2023-03-21 20:40         ` Sean Christopherson
  2023-03-21 21:38         ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-03-21 20:40 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Tue, Mar 21, 2023, Sean Christopherson wrote:
> On Tue, Mar 21, 2023, Vipin Sharma wrote:
> > On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > > > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > > > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > > > various nits when applying.
> > >
> > > Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> > > but damage done, and it's my fault for throwing you a big blob of code in the
> > > first place.
> > >
> > > I ended up splitting the "interesting" patches into three each:
> > >
> > >   1. Switch to the atomic-AND
> > >   2. Drop the access-tracking / dirty-logging (as appropriate)
> > >   3. Drop the call to __handle_changed_spte()
> > >
> > > because logically they are three different things (although obviously related).
> > >
> > > I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> > > sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> > > can take a look in the meantime to make sure I didn't do something completely
> > > boneheaded, it'd be much appreciated.
> > 
> > 
> > Thanks for refactoring the patches. I reviewed the commits, no obvious
> > red flags from my side. Few small nits I found:
> > 
> > commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
> > if TDP MMU SPTEs need wrprot")
> >  - kvm_ad_enabled() should be outside the loop.
> 
> Hmm, I deliberately left it inside the loop, but I agree that it would be better
> to hoist it out in that commit.
> 
> > commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
> > in the clear-dirty-log flow")
> >  - MMU_WARN_ON(kvm_ad_enabled() &&
> > spte_ad_need_write_protect(iter.old_spte) should be after
> > if(iter.level > PG_LEVEL_4k...)
> 
> Ah, hrm.  This was also deliberate, but looking at the diff I agree that relative
> to the diff, it's an unnecessary/unrelated change.  I think what I'll do is
> land the assertion above the "if (iter.level > PG_LEVEL_4K ||" in the above
> commit that switches to kvm_ad_enabled().  That way there shouldn't be any change
> for the assertion in this commit.

Aha!  Even better, split this into yet one more patch to dedup the guts before
switching to the atomic-AND, and give clear_dirty_gfn_range() the same treatment.
That further isolates the changes, provides solid justification for hoisting the
kvm_ad_enabled() check out of the loop (it's basically guaranteed to be a single
memory read that hits the L1), and keeps clear_dirty_gfn_range() and
clear_dirty_pt_masked() as similar as is reasonably possible.

Speaking of which, I'll send a patch to remove the redundant is_shadow_present_pte()
check in clear_dirty_gfn_range(), that's already handled by tdp_root_for_each_leaf_pte().

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch v3 0/7] Optimize clear dirty log
  2023-03-21 19:57       ` Sean Christopherson
  2023-03-21 20:40         ` Sean Christopherson
@ 2023-03-21 21:38         ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2023-03-21 21:38 UTC (permalink / raw)
  To: Vipin Sharma; +Cc: pbonzini, bgardon, dmatlack, kvm, linux-kernel

On Tue, Mar 21, 2023, Sean Christopherson wrote:
> It won't.  MMU_WARN_ON() is dead code without manual modification to define MMU_DEBUG.
> Part of the reason I used MMU_WARN_ON() was to remind myself to send a patch/series
> to overhaul MMU_WARN_ON[*].  My thought/hope is that a Kconfig will allow developers
> and testers to run with a pile of assertions and sanity checks without impacting
> the runtime overhead for production builds.
> 
> [*] https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com/

Ugh, I'm definitely sending that patch, MMU_DEBUG has bitrotted and broken the
build yet again.

arch/x86/kvm/mmu/mmu.c: In function ‘kvm_mmu_free_shadow_page’:
arch/x86/kvm/mmu/mmu.c:1738:15: error: implicit declaration of function ‘is_empty_shadow_page’; did you mean ‘to_shadow_page’? [-Werror=implicit-function-declaration]
 1738 |  MMU_WARN_ON(!is_empty_shadow_page(sp->spt));
      |               ^~~~~~~~~~~~~~~~~~~~
include/asm-generic/bug.h:110:25: note: in definition of macro ‘WARN_ON_ONCE’
  110 |  int __ret_warn_on = !!(condition);   \
      |                         ^~~~~~~~~
arch/x86/kvm/mmu/mmu.c:1738:2: note: in expansion of macro ‘MMU_WARN_ON’
 1738 |  MMU_WARN_ON(!is_empty_shadow_page(sp->spt));
      |  ^~~~~~~~~~~


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-03-21 21:38 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-11  1:46 [Patch v3 0/7] Optimize clear dirty log Vipin Sharma
2023-02-11  1:46 ` [Patch v3 1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Vipin Sharma
2023-02-11  1:46 ` [Patch v3 2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Vipin Sharma
2023-02-15 21:12   ` David Matlack
2023-03-17 22:59   ` Sean Christopherson
2023-03-17 23:50     ` Vipin Sharma
2023-02-11  1:46 ` [Patch v3 3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Vipin Sharma
2023-02-15 21:10   ` David Matlack
2023-02-11  1:46 ` [Patch v3 4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range Vipin Sharma
2023-02-15 21:15   ` David Matlack
2023-03-21  0:51   ` Sean Christopherson
2023-02-11  1:46 ` [Patch v3 5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Vipin Sharma
2023-02-22 19:31   ` David Matlack
2023-02-11  1:46 ` [Patch v3 6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Vipin Sharma
2023-02-22 19:36   ` David Matlack
2023-02-11  1:46 ` [Patch v3 7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions Vipin Sharma
2023-02-22 19:42   ` David Matlack
2023-03-17 22:51   ` Sean Christopherson
2023-03-17 23:48     ` Vipin Sharma
2023-03-17 22:57 ` [Patch v3 0/7] Optimize clear dirty log Sean Christopherson
2023-03-17 23:51   ` Vipin Sharma
2023-03-21  0:41   ` Sean Christopherson
2023-03-21 18:11     ` Vipin Sharma
2023-03-21 19:57       ` Sean Christopherson
2023-03-21 20:40         ` Sean Christopherson
2023-03-21 21:38         ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).