linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log
@ 2023-03-21 22:00 Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Sean Christopherson
                   ` (14 more replies)
  0 siblings, 15 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

This is a massaged version of Vipin's series to optimize clearing dirty
state in the TDP MMU.  It's basically the same as v3, just spread out over
more patches.  The only meaningful difference in the end is that
clear_dirty_gfn_range() also gets similar treatment in handling Dirty vs.
Writable logic.

Vipin, I'm still planning on applying this for 6.4, but the changes ended
up being a wee bit bigger than I'm comfortable making on the fly, thus the
formal posting.

v4:
- Split patches into more fine-grained chunks.
- Massage changelogs as needed.
- Collect reviews. [David]

v3:
- https://lore.kernel.org/all/20230211014626.3659152-1-vipinsh@google.com
- Tried to do better job at writing commit messages.
- Made kvm_tdp_mmu_clear_spte_bits() similar to the kvm_tdp_mmu_write_spte().
- clear_dirty_pt_masked() evaluates mask for the bit to be cleared outside the
  loop and use that for all of the SPTEs instead of calculating for each SPTE.
- Some naming changes based on the feedbacks.
- Split out the dead code clean from the optimization code.


v2: https://lore.kernel.org/lkml/20230203192822.106773-1-vipinsh@google.com/
- Clear dirty log and age gfn range does not go through
  handle_changed_spte, they handle their SPTE changes locally to improve
  their speed.
- Clear only specific bits atomically when updating SPTEs in clearing
  dirty log and aging gfn range functions.
- Removed tdp_mmu_set_spte_no_[acc_track|dirty_log] APIs.
- Converged all handle_changed_spte related functions to one place.

v1: https://lore.kernel.org/lkml/20230125213857.824959-1-vipinsh@google.com

Vipin Sharma (13):
  KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic
    write
  KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need
    wrprot
  KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU
  KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log
    flow
  KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty
    bits
  KVM: x86/mmu: Bypass __handle_changed_spte() when clearing TDP MMU
    dirty bits
  KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs
  KVM: x86/mmu: Drop unnecessary dirty log checks when aging TDP MMU
    SPTEs
  KVM: x86/mmu: Bypass __handle_changed_spte() when aging TDP MMU SPTEs
  KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  KVM: x86/mmu: Merge all handle_changed_pte*() functions

 arch/x86/kvm/mmu/tdp_iter.h |  48 +++++---
 arch/x86/kvm/mmu/tdp_mmu.c  | 215 ++++++++++++------------------------
 2 files changed, 106 insertions(+), 157 deletions(-)


base-commit: f3d90f901d18749dca096719540a075f59240051
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v4 01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 02/13] KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need wrprot Sean Christopherson
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Move conditions in kvm_tdp_mmu_write_spte() to check if an SPTE should
be written atomically or not to a separate function.

This new function, kvm_tdp_mmu_spte_need_atomic_write(),  will be used
in future commits to optimize clearing bits in SPTEs.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_iter.h | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index f0af385c56e0..c11c5d00b2c1 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -29,23 +29,29 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 	WRITE_ONCE(*rcu_dereference(sptep), new_spte);
 }
 
+/*
+ * SPTEs must be modified atomically if they are shadow-present, leaf
+ * SPTEs, and have volatile bits, i.e. has bits that can be set outside
+ * of mmu_lock.  The Writable bit can be set by KVM's fast page fault
+ * handler, and Accessed and Dirty bits can be set by the CPU.
+ *
+ * Note, non-leaf SPTEs do have Accessed bits and those bits are
+ * technically volatile, but KVM doesn't consume the Accessed bit of
+ * non-leaf SPTEs, i.e. KVM doesn't care if it clobbers the bit.  This
+ * logic needs to be reassessed if KVM were to use non-leaf Accessed
+ * bits, e.g. to skip stepping down into child SPTEs when aging SPTEs.
+ */
+static inline bool kvm_tdp_mmu_spte_need_atomic_write(u64 old_spte, int level)
+{
+	return is_shadow_present_pte(old_spte) &&
+	       is_last_spte(old_spte, level) &&
+	       spte_has_volatile_bits(old_spte);
+}
+
 static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 					 u64 new_spte, int level)
 {
-	/*
-	 * Atomically write the SPTE if it is a shadow-present, leaf SPTE with
-	 * volatile bits, i.e. has bits that can be set outside of mmu_lock.
-	 * The Writable bit can be set by KVM's fast page fault handler, and
-	 * Accessed and Dirty bits can be set by the CPU.
-	 *
-	 * Note, non-leaf SPTEs do have Accessed bits and those bits are
-	 * technically volatile, but KVM doesn't consume the Accessed bit of
-	 * non-leaf SPTEs, i.e. KVM doesn't care if it clobbers the bit.  This
-	 * logic needs to be reassessed if KVM were to use non-leaf Accessed
-	 * bits, e.g. to skip stepping down into child SPTEs when aging SPTEs.
-	 */
-	if (is_shadow_present_pte(old_spte) && is_last_spte(old_spte, level) &&
-	    spte_has_volatile_bits(old_spte))
+	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
 		return kvm_tdp_mmu_write_spte_atomic(sptep, new_spte);
 
 	__kvm_tdp_mmu_write_spte(sptep, new_spte);
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 02/13] KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need wrprot
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 03/13] KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU Sean Christopherson
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Use the constant-after-module-load kvm_ad_enabled() to check if SPTEs in
the TDP MMU need to be write-protected when clearing accessed/dirty status
instead of manually checking every SPTE.  The per-SPTE A/D enabling is
specific to nested EPT MMUs, i.e. when KVM is using EPT A/D bits but L1 is
not, and so cannot happen in the TDP MMU (which is non-nested only).

Keep the original code as sanity checks buried under MMU_WARN_ON().
MMU_WARN_ON() is more or less useless at the moment, but there are plans
to change that.

Link: https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
[sean: split to separate patch, apply to dirty path, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 7c25dbf32ecc..5a5642650c3e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1621,7 +1621,10 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		if (!is_shadow_present_pte(iter.old_spte))
 			continue;
 
-		if (spte_ad_need_write_protect(iter.old_spte)) {
+		MMU_WARN_ON(kvm_ad_enabled() &&
+			    spte_ad_need_write_protect(iter.old_spte));
+
+		if (!kvm_ad_enabled()) {
 			if (is_writable_pte(iter.old_spte))
 				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
 			else
@@ -1685,13 +1688,16 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 		if (!mask)
 			break;
 
+		MMU_WARN_ON(kvm_ad_enabled() &&
+			    spte_ad_need_write_protect(iter.old_spte));
+
 		if (iter.level > PG_LEVEL_4K ||
 		    !(mask & (1UL << (iter.gfn - gfn))))
 			continue;
 
 		mask &= ~(1UL << (iter.gfn - gfn));
 
-		if (wrprot || spte_ad_need_write_protect(iter.old_spte)) {
+		if (wrprot || !kvm_ad_enabled()) {
 			if (is_writable_pte(iter.old_spte))
 				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
 			else
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 03/13] KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 02/13] KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need wrprot Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 04/13] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Sean Christopherson
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Deduplicate the guts of the TDP MMU's clearing of dirty status by
snapshotting whether to check+clear the Dirty bit vs. the Writable bit,
which is the only difference between the two flavors of dirty tracking.

Note, kvm_ad_enabled() is just a wrapper for shadow_accessed_mask, i.e.
is constant after kvm-{intel,amd}.ko is loaded.

Link: https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
[sean: split to separate patch, apply to dirty log, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 35 +++++++++--------------------------
 1 file changed, 9 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 5a5642650c3e..b32c9ba05c89 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1607,8 +1607,8 @@ void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm,
 static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 			   gfn_t start, gfn_t end)
 {
+	u64 dbit = kvm_ad_enabled() ? shadow_dirty_mask : PT_WRITABLE_MASK;
 	struct tdp_iter iter;
-	u64 new_spte;
 	bool spte_set = false;
 
 	rcu_read_lock();
@@ -1624,19 +1624,10 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		MMU_WARN_ON(kvm_ad_enabled() &&
 			    spte_ad_need_write_protect(iter.old_spte));
 
-		if (!kvm_ad_enabled()) {
-			if (is_writable_pte(iter.old_spte))
-				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
-			else
-				continue;
-		} else {
-			if (iter.old_spte & shadow_dirty_mask)
-				new_spte = iter.old_spte & ~shadow_dirty_mask;
-			else
-				continue;
-		}
+		if (!(iter.old_spte & dbit))
+			continue;
 
-		if (tdp_mmu_set_spte_atomic(kvm, &iter, new_spte))
+		if (tdp_mmu_set_spte_atomic(kvm, &iter, iter.old_spte & ~dbit))
 			goto retry;
 
 		spte_set = true;
@@ -1678,8 +1669,9 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
 static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 				  gfn_t gfn, unsigned long mask, bool wrprot)
 {
+	u64 dbit = (wrprot || !kvm_ad_enabled()) ? PT_WRITABLE_MASK :
+						   shadow_dirty_mask;
 	struct tdp_iter iter;
-	u64 new_spte;
 
 	rcu_read_lock();
 
@@ -1697,19 +1689,10 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		mask &= ~(1UL << (iter.gfn - gfn));
 
-		if (wrprot || !kvm_ad_enabled()) {
-			if (is_writable_pte(iter.old_spte))
-				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
-			else
-				continue;
-		} else {
-			if (iter.old_spte & shadow_dirty_mask)
-				new_spte = iter.old_spte & ~shadow_dirty_mask;
-			else
-				continue;
-		}
+		if (!(iter.old_spte & dbit))
+			continue;
 
-		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
+		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, iter.old_spte & ~dbit);
 	}
 
 	rcu_read_unlock();
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 04/13] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (2 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 03/13] KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 05/13] KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty bits Sean Christopherson
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Optimize the clearing of dirty state in TDP MMU SPTEs by doing an
atomic-AND (on SPTEs that have volatile bits) instead of the full XCHG
that currently ends up being invoked (see kvm_tdp_mmu_write_spte()).
Clearing _only_ the bit in question will allow KVM to skip the many
irrelevant checks in __handle_changed_spte() by avoiding any collateral
damage due to the XCHG writing all SPTE bits, e.g. the XCHG could race
with fast_page_fault() setting the W-bit and the CPU setting the D-bit,
and thus incorrectly drop the CPU's D-bit update.

Link: https://lore.kernel.org/all/Y9hXmz%2FnDOr1hQal@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: split the switch to atomic-AND to a separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_iter.h | 14 ++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c  | 16 ++++++++--------
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index c11c5d00b2c1..fae559559a80 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -58,6 +58,20 @@ static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 	return old_spte;
 }
 
+static inline u64 tdp_mmu_clear_spte_bits(tdp_ptep_t sptep, u64 old_spte,
+					  u64 mask, int level)
+{
+	atomic64_t *sptep_atomic;
+
+	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level)) {
+		sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
+		return (u64)atomic64_fetch_and(~mask, sptep_atomic);
+	}
+
+	__kvm_tdp_mmu_write_spte(sptep, old_spte & ~mask);
+	return old_spte;
+}
+
 /*
  * A TDP iterator performs a pre-order walk over a TDP paging structure.
  */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b32c9ba05c89..a70cc1dae18a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -770,13 +770,6 @@ static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
 	_tdp_mmu_set_spte(kvm, iter, new_spte, false, true);
 }
 
-static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
-						 struct tdp_iter *iter,
-						 u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true, false);
-}
-
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
 	for_each_tdp_pte(_iter, _root, _start, _end)
 
@@ -1692,7 +1685,14 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 		if (!(iter.old_spte & dbit))
 			continue;
 
-		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, iter.old_spte & ~dbit);
+		iter.old_spte = tdp_mmu_clear_spte_bits(iter.sptep,
+							iter.old_spte, dbit,
+							iter.level);
+
+		__handle_changed_spte(kvm, iter.as_id, iter.gfn, iter.old_spte,
+				      iter.old_spte & ~dbit, iter.level, false);
+		handle_changed_spte_acc_track(iter.old_spte, iter.old_spte & ~dbit,
+					      iter.level);
 	}
 
 	rcu_read_unlock();
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 05/13] KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty bits
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (3 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 04/13] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Drop the unnecessary call to handle access-tracking changes when clearing
the dirty status of TDP MMU SPTEs.  Neither the Dirty bit nor the Writable
bit has any impact on the accessed state of a page, i.e. clearing only
the aforementioned bits doesn't make an accessed SPTE suddently not
accessed.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
[sean: split to separate patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index a70cc1dae18a..950c5d23ecee 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1691,8 +1691,6 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		__handle_changed_spte(kvm, iter.as_id, iter.gfn, iter.old_spte,
 				      iter.old_spte & ~dbit, iter.level, false);
-		handle_changed_spte_acc_track(iter.old_spte, iter.old_spte & ~dbit,
-					      iter.level);
 	}
 
 	rcu_read_unlock();
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() when clearing TDP MMU dirty bits
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (4 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 05/13] KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty bits Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-06-25  7:44   ` Like Xu
  2023-03-21 22:00 ` [PATCH v4 07/13] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Sean Christopherson
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Drop everything except marking the PFN dirty and the relevant tracepoint
parts of __handle_changed_spte() when clearing the dirty status of gfns in
the TDP MMU.  Clearing only the Dirty (or Writable) bit doesn't affect
the SPTEs shadow-present status, whether or not the SPTE is a leaf, or
change the SPTE's PFN.  I.e. other than marking the PFN dirty, none of the
functional updates handled by __handle_changed_spte() are relevant.

Losing __handle_changed_spte()'s sanity checks does mean that a bug could
theoretical go unnoticed, but that scenario is extremely unlikely, e.g.
would effectively require a misconfigured or a locking bug elsewhere.

Opportunistically remove a comment blurb from __handle_changed_spte()
about all modifications to TDP MMU SPTEs needing to invoke said function,
that "rule" hasn't been true since fast page fault support was added for
the TDP MMU (and perhaps even before).

Tested on a VM (160 vCPUs, 160 GB memory) and found that performance of
clear dirty log stage improved by ~40% in dirty_log_perf_test (with the
full optimization applied).

Before optimization:
--------------------
Iteration 1 clear dirty log time: 3.638543593s
Iteration 2 clear dirty log time: 3.145032742s
Iteration 3 clear dirty log time: 3.142340358s
Clear dirty log over 3 iterations took 9.925916693s. (Avg 3.308638897s/iteration)

After optimization:
-------------------
Iteration 1 clear dirty log time: 2.318988110s
Iteration 2 clear dirty log time: 1.794470164s
Iteration 3 clear dirty log time: 1.791668628s
Clear dirty log over 3 iterations took 5.905126902s. (Avg 1.968375634s/iteration)

Link: https://lore.kernel.org/all/Y9hXmz%2FnDOr1hQal@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: split the switch to atomic-AND to a separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 950c5d23ecee..467931c43968 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -517,7 +517,6 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
  *	    threads that might be modifying SPTEs.
  *
  * Handle bookkeeping that might result from the modification of a SPTE.
- * This function must be called for all TDP SPTE modifications.
  */
 static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				  u64 old_spte, u64 new_spte, int level,
@@ -1689,8 +1688,10 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 							iter.old_spte, dbit,
 							iter.level);
 
-		__handle_changed_spte(kvm, iter.as_id, iter.gfn, iter.old_spte,
-				      iter.old_spte & ~dbit, iter.level, false);
+		trace_kvm_tdp_mmu_spte_changed(iter.as_id, iter.gfn, iter.level,
+					       iter.old_spte,
+					       iter.old_spte & ~dbit);
+		kvm_set_pfn_dirty(spte_to_pfn(iter.old_spte));
 	}
 
 	rcu_read_unlock();
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 07/13] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (5 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 08/13] KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs Sean Christopherson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Remove bool parameter "record_dirty_log" from __tdp_mmu_set_spte() and
refactor the code as this variable is always set to true by its caller.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 467931c43968..3cc81fa22b7f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -708,18 +708,13 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
  *		      notifier for access tracking. Leaving record_acc_track
  *		      unset in that case prevents page accesses from being
  *		      double counted.
- * @record_dirty_log: Record the page as dirty in the dirty bitmap if
- *		      appropriate for the change being made. Should be set
- *		      unless performing certain dirty logging operations.
- *		      Leaving record_dirty_log unset in that case prevents page
- *		      writes from being double counted.
  *
  * Returns the old SPTE value, which _may_ be different than @old_spte if the
  * SPTE had voldatile bits.
  */
 static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 			      u64 old_spte, u64 new_spte, gfn_t gfn, int level,
-			      bool record_acc_track, bool record_dirty_log)
+			      bool record_acc_track)
 {
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
@@ -738,35 +733,34 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	if (record_acc_track)
 		handle_changed_spte_acc_track(old_spte, new_spte, level);
-	if (record_dirty_log)
-		handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
-					      new_spte, level);
+
+	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
+				      level);
 	return old_spte;
 }
 
 static inline void _tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				     u64 new_spte, bool record_acc_track,
-				     bool record_dirty_log)
+				     u64 new_spte, bool record_acc_track)
 {
 	WARN_ON_ONCE(iter->yielded);
 
 	iter->old_spte = __tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
 					    iter->old_spte, new_spte,
 					    iter->gfn, iter->level,
-					    record_acc_track, record_dirty_log);
+					    record_acc_track);
 }
 
 static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 				    u64 new_spte)
 {
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true, true);
+	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
 }
 
 static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
 						 struct tdp_iter *iter,
 						 u64 new_spte)
 {
-	_tdp_mmu_set_spte(kvm, iter, new_spte, false, true);
+	_tdp_mmu_set_spte(kvm, iter, new_spte, false);
 }
 
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
@@ -916,7 +910,7 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 		return false;
 
 	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
-			   sp->gfn, sp->role.level + 1, true, true);
+			   sp->gfn, sp->role.level + 1, true);
 
 	return true;
 }
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 08/13] KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (6 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 07/13] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 09/13] KVM: x86/mmu: Drop unnecessary dirty log checks " Sean Christopherson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Use tdp_mmu_clear_spte_bits() when clearing the Accessed bit in TDP MMU
SPTEs so as to use an atomic-AND instead of XCHG to clear the A-bit.
Similar to the D-bit story, this will allow KVM to bypass
__handle_changed_spte() by ensuring only the A-bit is modified.

Link: https://lore.kernel.org/all/Y9HcHRBShQgjxsQb@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 3cc81fa22b7f..adbdfed287cc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -756,13 +756,6 @@ static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
 }
 
-static inline void tdp_mmu_set_spte_no_acc_track(struct kvm *kvm,
-						 struct tdp_iter *iter,
-						 u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, false);
-}
-
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
 	for_each_tdp_pte(_iter, _root, _start, _end)
 
@@ -1248,33 +1241,44 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
 /*
  * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
  * if any of the GFNs in the range have been accessed.
+ *
+ * No need to mark the corresponding PFN as accessed as this call is coming
+ * from the clear_young() or clear_flush_young() notifier, which uses the
+ * return value to determine if the page has been accessed.
  */
 static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 			  struct kvm_gfn_range *range)
 {
-	u64 new_spte = 0;
+	u64 new_spte;
 
 	/* If we have a non-accessed entry we don't need to change the pte. */
 	if (!is_accessed_spte(iter->old_spte))
 		return false;
 
-	new_spte = iter->old_spte;
-
-	if (spte_ad_enabled(new_spte)) {
-		new_spte &= ~shadow_accessed_mask;
+	if (spte_ad_enabled(iter->old_spte)) {
+		iter->old_spte = tdp_mmu_clear_spte_bits(iter->sptep,
+							 iter->old_spte,
+							 shadow_accessed_mask,
+							 iter->level);
+		new_spte = iter->old_spte & ~shadow_accessed_mask;
 	} else {
 		/*
 		 * Capture the dirty status of the page, so that it doesn't get
 		 * lost when the SPTE is marked for access tracking.
 		 */
-		if (is_writable_pte(new_spte))
-			kvm_set_pfn_dirty(spte_to_pfn(new_spte));
+		if (is_writable_pte(iter->old_spte))
+			kvm_set_pfn_dirty(spte_to_pfn(iter->old_spte));
 
-		new_spte = mark_spte_for_access_track(new_spte);
+		new_spte = mark_spte_for_access_track(iter->old_spte);
+		iter->old_spte = kvm_tdp_mmu_write_spte(iter->sptep,
+							iter->old_spte, new_spte,
+							iter->level);
 	}
 
-	tdp_mmu_set_spte_no_acc_track(kvm, iter, new_spte);
-
+	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
+			      new_spte, iter->level, false);
+	handle_changed_spte_dirty_log(kvm, iter->as_id, iter->gfn,
+				      iter->old_spte, new_spte, iter->level);
 	return true;
 }
 
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 09/13] KVM: x86/mmu: Drop unnecessary dirty log checks when aging TDP MMU SPTEs
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (7 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 08/13] KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 10/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Drop the unnecessary call to handle dirty log updates when aging TDP MMU
SPTEs, as neither clearing the Accessed bit nor marking a SPTE for access
tracking can _set_ the Writable bit, i.e. can't trigger marking a gfn
dirty in its memslot.  The access tracking path can _clear_ the Writable
bit, e.g. if the XCHG races with fast_page_fault() and writes the stale
value without the Writable bit set, but clearing the Writable bit outside
of mmu_lock is not allowed, i.e. access tracking can't spuriously set the
Writable bit.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
[sean: split to separate patch, apply to dirty path, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index adbdfed287cc..29bb97ff266e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1277,8 +1277,6 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 
 	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
 			      new_spte, iter->level, false);
-	handle_changed_spte_dirty_log(kvm, iter->as_id, iter->gfn,
-				      iter->old_spte, new_spte, iter->level);
 	return true;
 }
 
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 10/13] KVM: x86/mmu: Bypass __handle_changed_spte() when aging TDP MMU SPTEs
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (8 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 09/13] KVM: x86/mmu: Drop unnecessary dirty log checks " Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 11/13] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Sean Christopherson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Drop everything except the "tdp_mmu_spte_changed" tracepoint part of
__handle_changed_spte() when aging SPTEs in the TDP MMU, as clearing the
accessed status doesn't affect the SPTE's shadow-present status, whether
or not the SPTE is a leaf, or change the PFN.  I.e. none of the functional
updates handled by __handle_changed_spte() are relevant.

Losing __handle_changed_spte()'s sanity checks does mean that a bug could
theoretical go unnoticed, but that scenario is extremely unlikely, e.g.
would effectively require a misconfigured MMU or a locking bug elsewhere.

Link: https://lore.kernel.org/all/Y9HcHRBShQgjxsQb@google.com
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 29bb97ff266e..cdfb67ef5800 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1275,8 +1275,8 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 							iter->level);
 	}
 
-	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
-			      new_spte, iter->level, false);
+	trace_kvm_tdp_mmu_spte_changed(iter->as_id, iter->gfn, iter->level,
+				       iter->old_spte, new_spte);
 	return true;
 }
 
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 11/13] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (9 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 10/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 12/13] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Sean Christopherson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Remove bool parameter "record_acc_track" from __tdp_mmu_set_spte() and
refactor the code. This variable is always set to true by its caller.

Remove single and double underscore prefix from tdp_mmu_set_spte()
related APIs:
1. Change __tdp_mmu_set_spte() to tdp_mmu_set_spte()
2. Change _tdp_mmu_set_spte() to tdp_mmu_iter_set_spte()

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 51 +++++++++++++-------------------------
 1 file changed, 17 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index cdfb67ef5800..9649e0fe4302 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -695,7 +695,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 
 
 /*
- * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
+ * tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
  * @kvm:	      KVM instance
  * @as_id:	      Address space ID, i.e. regular vs. SMM
  * @sptep:	      Pointer to the SPTE
@@ -703,18 +703,12 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
  * @new_spte:	      The new value that will be set for the SPTE
  * @gfn:	      The base GFN that was (or will be) mapped by the SPTE
  * @level:	      The level _containing_ the SPTE (its parent PT's level)
- * @record_acc_track: Notify the MM subsystem of changes to the accessed state
- *		      of the page. Should be set unless handling an MMU
- *		      notifier for access tracking. Leaving record_acc_track
- *		      unset in that case prevents page accesses from being
- *		      double counted.
  *
  * Returns the old SPTE value, which _may_ be different than @old_spte if the
  * SPTE had voldatile bits.
  */
-static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
-			      u64 old_spte, u64 new_spte, gfn_t gfn, int level,
-			      bool record_acc_track)
+static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
+			    u64 old_spte, u64 new_spte, gfn_t gfn, int level)
 {
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
@@ -730,30 +724,19 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 	old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
 
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
-
-	if (record_acc_track)
-		handle_changed_spte_acc_track(old_spte, new_spte, level);
-
+	handle_changed_spte_acc_track(old_spte, new_spte, level);
 	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
 				      level);
 	return old_spte;
 }
 
-static inline void _tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				     u64 new_spte, bool record_acc_track)
+static inline void tdp_mmu_iter_set_spte(struct kvm *kvm, struct tdp_iter *iter,
+					 u64 new_spte)
 {
 	WARN_ON_ONCE(iter->yielded);
-
-	iter->old_spte = __tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
-					    iter->old_spte, new_spte,
-					    iter->gfn, iter->level,
-					    record_acc_track);
-}
-
-static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
-				    u64 new_spte)
-{
-	_tdp_mmu_set_spte(kvm, iter, new_spte, true);
+	iter->old_spte = tdp_mmu_set_spte(kvm, iter->as_id, iter->sptep,
+					  iter->old_spte, new_spte,
+					  iter->gfn, iter->level);
 }
 
 #define tdp_root_for_each_pte(_iter, _root, _start, _end) \
@@ -845,7 +828,7 @@ static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			continue;
 
 		if (!shared)
-			tdp_mmu_set_spte(kvm, &iter, 0);
+			tdp_mmu_iter_set_spte(kvm, &iter, 0);
 		else if (tdp_mmu_set_spte_atomic(kvm, &iter, 0))
 			goto retry;
 	}
@@ -902,8 +885,8 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 	if (WARN_ON_ONCE(!is_shadow_present_pte(old_spte)))
 		return false;
 
-	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
-			   sp->gfn, sp->role.level + 1, true);
+	tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
+			 sp->gfn, sp->role.level + 1);
 
 	return true;
 }
@@ -937,7 +920,7 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
 
-		tdp_mmu_set_spte(kvm, &iter, 0);
+		tdp_mmu_iter_set_spte(kvm, &iter, 0);
 		flush = true;
 	}
 
@@ -1107,7 +1090,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 		if (ret)
 			return ret;
 	} else {
-		tdp_mmu_set_spte(kvm, iter, spte);
+		tdp_mmu_iter_set_spte(kvm, iter, spte);
 	}
 
 	tdp_account_mmu_page(kvm, sp);
@@ -1314,13 +1297,13 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
 	 * invariant that the PFN of a present * leaf SPTE can never change.
 	 * See __handle_changed_spte().
 	 */
-	tdp_mmu_set_spte(kvm, iter, 0);
+	tdp_mmu_iter_set_spte(kvm, iter, 0);
 
 	if (!pte_write(range->pte)) {
 		new_spte = kvm_mmu_changed_pte_notifier_make_spte(iter->old_spte,
 								  pte_pfn(range->pte));
 
-		tdp_mmu_set_spte(kvm, iter, new_spte);
+		tdp_mmu_iter_set_spte(kvm, iter, new_spte);
 	}
 
 	return true;
@@ -1805,7 +1788,7 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
 		if (new_spte == iter.old_spte)
 			break;
 
-		tdp_mmu_set_spte(kvm, &iter, new_spte);
+		tdp_mmu_iter_set_spte(kvm, &iter, new_spte);
 		spte_set = true;
 	}
 
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 12/13] KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (10 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 11/13] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-21 22:00 ` [PATCH v4 13/13] KVM: x86/mmu: Merge all handle_changed_pte*() functions Sean Christopherson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Remove handle_changed_spte_dirty_log() as there is no code flow which
sets 4KiB SPTE writable and hit this path. This function marks the page
dirty in a memslot only if new SPTE is 4KiB in size and writable.

Current users of handle_changed_spte_dirty_log() are:
1. set_spte_gfn() - Create only non writable SPTEs.
2. write_protect_gfn() - Change an SPTE to non writable.
3. zap leaf and roots APIs - Everything is 0.
4. handle_removed_pt() - Sets SPTEs to REMOVED_SPTE
5. tdp_mmu_link_sp() - Makes non leaf SPTEs.

There is also no path which creates a writable 4KiB without going
through make_spte() and this functions takes care of marking SPTE dirty
in the memslot if it is PT_WRITABLE.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: add blurb to __handle_changed_spte()'s comment]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 26 +++-----------------------
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9649e0fe4302..e8ee49b6da5b 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -345,24 +345,6 @@ static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
 		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
 }
 
-static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
-					  u64 old_spte, u64 new_spte, int level)
-{
-	bool pfn_changed;
-	struct kvm_memory_slot *slot;
-
-	if (level > PG_LEVEL_4K)
-		return;
-
-	pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
-
-	if ((!is_writable_pte(old_spte) || pfn_changed) &&
-	    is_writable_pte(new_spte)) {
-		slot = __gfn_to_memslot(__kvm_memslots(kvm, as_id), gfn);
-		mark_page_dirty_in_slot(kvm, slot, gfn);
-	}
-}
-
 static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, +1);
@@ -516,7 +498,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
  *	    the MMU lock and the operation must synchronize with other
  *	    threads that might be modifying SPTEs.
  *
- * Handle bookkeeping that might result from the modification of a SPTE.
+ * Handle bookkeeping that might result from the modification of a SPTE.  Note,
+ * dirty logging updates are handled in common code, not here (see make_spte()
+ * and fast_pf_fix_direct_spte()).
  */
 static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				  u64 old_spte, u64 new_spte, int level,
@@ -613,8 +597,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
 			      shared);
 	handle_changed_spte_acc_track(old_spte, new_spte, level);
-	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
-				      new_spte, level);
 }
 
 /*
@@ -725,8 +707,6 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
 	handle_changed_spte_acc_track(old_spte, new_spte, level);
-	handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte,
-				      level);
 	return old_spte;
 }
 
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 13/13] KVM: x86/mmu: Merge all handle_changed_pte*() functions
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (11 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 12/13] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Sean Christopherson
@ 2023-03-21 22:00 ` Sean Christopherson
  2023-03-31  0:13 ` [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Vipin Sharma
  2023-04-04 23:44 ` Sean Christopherson
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-03-21 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

From: Vipin Sharma <vipinsh@google.com>

Merge __handle_changed_pte() and handle_changed_spte_acc_track() into a
single function, handle_changed_pte(), as the two are always used
together.  Remove the existing handle_changed_pte(), as it's just a
wrapper that calls __handle_changed_pte() and
handle_changed_spte_acc_track().

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 42 +++++++++++---------------------------
 1 file changed, 12 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e8ee49b6da5b..b2fca11b91ff 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -334,17 +334,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				u64 old_spte, u64 new_spte, int level,
 				bool shared);
 
-static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level)
-{
-	if (!is_shadow_present_pte(old_spte) || !is_last_spte(old_spte, level))
-		return;
-
-	if (is_accessed_spte(old_spte) &&
-	    (!is_shadow_present_pte(new_spte) || !is_accessed_spte(new_spte) ||
-	     spte_to_pfn(old_spte) != spte_to_pfn(new_spte)))
-		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
-}
-
 static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, +1);
@@ -487,7 +476,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
 }
 
 /**
- * __handle_changed_spte - handle bookkeeping associated with an SPTE change
+ * handle_changed_spte - handle bookkeeping associated with an SPTE change
  * @kvm: kvm instance
  * @as_id: the address space of the paging structure the SPTE was a part of
  * @gfn: the base GFN that was mapped by the SPTE
@@ -502,9 +491,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
  * dirty logging updates are handled in common code, not here (see make_spte()
  * and fast_pf_fix_direct_spte()).
  */
-static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
-				  u64 old_spte, u64 new_spte, int level,
-				  bool shared)
+static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
+				u64 old_spte, u64 new_spte, int level,
+				bool shared)
 {
 	bool was_present = is_shadow_present_pte(old_spte);
 	bool is_present = is_shadow_present_pte(new_spte);
@@ -588,15 +577,10 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	if (was_present && !was_leaf &&
 	    (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
 		handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared);
-}
 
-static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
-				u64 old_spte, u64 new_spte, int level,
-				bool shared)
-{
-	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
-			      shared);
-	handle_changed_spte_acc_track(old_spte, new_spte, level);
+	if (was_leaf && is_accessed_spte(old_spte) &&
+	    (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
+		kvm_set_pfn_accessed(spte_to_pfn(old_spte));
 }
 
 /*
@@ -639,9 +623,8 @@ static inline int tdp_mmu_set_spte_atomic(struct kvm *kvm,
 	if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte))
 		return -EBUSY;
 
-	__handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
-			      new_spte, iter->level, true);
-	handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level);
+	handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
+			    new_spte, iter->level, true);
 
 	return 0;
 }
@@ -705,8 +688,7 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
 
 	old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
 
-	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
-	handle_changed_spte_acc_track(old_spte, new_spte, level);
+	handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
 	return old_spte;
 }
 
@@ -1275,7 +1257,7 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
 	 * Note, when changing a read-only SPTE, it's not strictly necessary to
 	 * zero the SPTE before setting the new PFN, but doing so preserves the
 	 * invariant that the PFN of a present * leaf SPTE can never change.
-	 * See __handle_changed_spte().
+	 * See handle_changed_spte().
 	 */
 	tdp_mmu_iter_set_spte(kvm, iter, 0);
 
@@ -1300,7 +1282,7 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	/*
 	 * No need to handle the remote TLB flush under RCU protection, the
 	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
-	 * shadow page.  See the WARN on pfn_changed in __handle_changed_spte().
+	 * shadow page. See the WARN on pfn_changed in handle_changed_spte().
 	 */
 	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn);
 }
-- 
2.40.0.rc2.332.ga46443480c-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (12 preceding siblings ...)
  2023-03-21 22:00 ` [PATCH v4 13/13] KVM: x86/mmu: Merge all handle_changed_pte*() functions Sean Christopherson
@ 2023-03-31  0:13 ` Vipin Sharma
  2023-04-04 23:44 ` Sean Christopherson
  14 siblings, 0 replies; 18+ messages in thread
From: Vipin Sharma @ 2023-03-31  0:13 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, David Matlack, Ben Gardon

On Tue, Mar 21, 2023 at 3:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> This is a massaged version of Vipin's series to optimize clearing dirty
> state in the TDP MMU.  It's basically the same as v3, just spread out over
> more patches.  The only meaningful difference in the end is that
> clear_dirty_gfn_range() also gets similar treatment in handling Dirty vs.
> Writable logic.
>
> Vipin, I'm still planning on applying this for 6.4, but the changes ended
> up being a wee bit bigger than I'm comfortable making on the fly, thus the
> formal posting.
>

Changes look good to me and verified by running perf test also that
the improvement is still similar to previous version. Thanks for the
v4 work.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log
  2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
                   ` (13 preceding siblings ...)
  2023-03-31  0:13 ` [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Vipin Sharma
@ 2023-04-04 23:44 ` Sean Christopherson
  14 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-04-04 23:44 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Vipin Sharma, David Matlack, Ben Gardon

On Tue, 21 Mar 2023 15:00:08 -0700, Sean Christopherson wrote:
> This is a massaged version of Vipin's series to optimize clearing dirty
> state in the TDP MMU.  It's basically the same as v3, just spread out over
> more patches.  The only meaningful difference in the end is that
> clear_dirty_gfn_range() also gets similar treatment in handling Dirty vs.
> Writable logic.
> 
> Vipin, I'm still planning on applying this for 6.4, but the changes ended
> up being a wee bit bigger than I'm comfortable making on the fly, thus the
> formal posting.
> 
> [...]

Applied to kvm-x86 mmu, thanks!

[01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write
        https://github.com/kvm-x86/linux/commit/41e07665f1a6
[02/13] KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need wrprot
        https://github.com/kvm-x86/linux/commit/5982a5392663
[03/13] KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU
        https://github.com/kvm-x86/linux/commit/697c89bed94e
[04/13] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow
        https://github.com/kvm-x86/linux/commit/89c313f20c1e
[05/13] KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty bits
        https://github.com/kvm-x86/linux/commit/cf05e8c7325e
[06/13] KVM: x86/mmu: Bypass __handle_changed_spte() when clearing TDP MMU dirty bits
        https://github.com/kvm-x86/linux/commit/1e0f42985ffa
[07/13] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
        https://github.com/kvm-x86/linux/commit/e73008705d0c
[08/13] KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs
        https://github.com/kvm-x86/linux/commit/7ee131e3a3c3
[09/13] KVM: x86/mmu: Drop unnecessary dirty log checks when aging TDP MMU SPTEs
        https://github.com/kvm-x86/linux/commit/6141df067d04
[10/13] KVM: x86/mmu: Bypass __handle_changed_spte() when aging TDP MMU SPTEs
        https://github.com/kvm-x86/linux/commit/891f11596068
[11/13] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
        https://github.com/kvm-x86/linux/commit/0b7cc2547d53
[12/13] KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
        https://github.com/kvm-x86/linux/commit/1f9973456e80
[13/13] KVM: x86/mmu: Merge all handle_changed_pte*() functions
        https://github.com/kvm-x86/linux/commit/40fa907e5a69

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() when clearing TDP MMU dirty bits
  2023-03-21 22:00 ` [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
@ 2023-06-25  7:44   ` Like Xu
  2023-06-26 21:37     ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: Like Xu @ 2023-06-25  7:44 UTC (permalink / raw)
  To: Sean Christopherson, Vipin Sharma
  Cc: kvm, linux-kernel, David Matlack, Ben Gardon, Paolo Bonzini

On 22/3/2023 6:00 am, Sean Christopherson wrote:
> From: Vipin Sharma <vipinsh@google.com>
> 
> Drop everything except marking the PFN dirty and the relevant tracepoint
> parts of __handle_changed_spte() when clearing the dirty status of gfns in
> the TDP MMU.  Clearing only the Dirty (or Writable) bit doesn't affect
> the SPTEs shadow-present status, whether or not the SPTE is a leaf, or
> change the SPTE's PFN.  I.e. other than marking the PFN dirty, none of the
> functional updates handled by __handle_changed_spte() are relevant.
> 
> Losing __handle_changed_spte()'s sanity checks does mean that a bug could
> theoretical go unnoticed, but that scenario is extremely unlikely, e.g.
> would effectively require a misconfigured or a locking bug elsewhere.
> 
> Opportunistically remove a comment blurb from __handle_changed_spte()
> about all modifications to TDP MMU SPTEs needing to invoke said function,
> that "rule" hasn't been true since fast page fault support was added for
> the TDP MMU (and perhaps even before).
> 
> Tested on a VM (160 vCPUs, 160 GB memory) and found that performance of
> clear dirty log stage improved by ~40% in dirty_log_perf_test (with the
> full optimization applied).
> 
> Before optimization:
> --------------------
> Iteration 1 clear dirty log time: 3.638543593s
> Iteration 2 clear dirty log time: 3.145032742s
> Iteration 3 clear dirty log time: 3.142340358s
> Clear dirty log over 3 iterations took 9.925916693s. (Avg 3.308638897s/iteration)
> 
> After optimization:
> -------------------
> Iteration 1 clear dirty log time: 2.318988110s
> Iteration 2 clear dirty log time: 1.794470164s
> Iteration 3 clear dirty log time: 1.791668628s
> Clear dirty log over 3 iterations took 5.905126902s. (Avg 1.968375634s/iteration)
> 
> Link: https://lore.kernel.org/all/Y9hXmz%2FnDOr1hQal@google.com
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> Reviewed-by: David Matlack <dmatlack@google.com>
> [sean: split the switch to atomic-AND to a separate patch]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/mmu/tdp_mmu.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 950c5d23ecee..467931c43968 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -517,7 +517,6 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>    *	    threads that might be modifying SPTEs.
>    *
>    * Handle bookkeeping that might result from the modification of a SPTE.
> - * This function must be called for all TDP SPTE modifications.
>    */
>   static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>   				  u64 old_spte, u64 new_spte, int level,
> @@ -1689,8 +1688,10 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
>   							iter.old_spte, dbit,
>   							iter.level);
>   
> -		__handle_changed_spte(kvm, iter.as_id, iter.gfn, iter.old_spte,
> -				      iter.old_spte & ~dbit, iter.level, false);
> +		trace_kvm_tdp_mmu_spte_changed(iter.as_id, iter.gfn, iter.level,

Here the first parameter "kvm" is no longer used in this context.

Please help confirm that for clear_dirty_pt_masked(), should the "struct kvm 
*kvm" parameter
be cleared from the list of incoming parameters ?

> +					       iter.old_spte,
> +					       iter.old_spte & ~dbit);
> +		kvm_set_pfn_dirty(spte_to_pfn(iter.old_spte));
>   	}
>   
>   	rcu_read_unlock();

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() when clearing TDP MMU dirty bits
  2023-06-25  7:44   ` Like Xu
@ 2023-06-26 21:37     ` Sean Christopherson
  0 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2023-06-26 21:37 UTC (permalink / raw)
  To: Like Xu
  Cc: Vipin Sharma, kvm, linux-kernel, David Matlack, Ben Gardon,
	Paolo Bonzini

On Sun, Jun 25, 2023, Like Xu wrote:
> On 22/3/2023 6:00 am, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 950c5d23ecee..467931c43968 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -517,7 +517,6 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
> >    *	    threads that might be modifying SPTEs.
> >    *
> >    * Handle bookkeeping that might result from the modification of a SPTE.
> > - * This function must be called for all TDP SPTE modifications.
> >    */
> >   static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> >   				  u64 old_spte, u64 new_spte, int level,
> > @@ -1689,8 +1688,10 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
> >   							iter.old_spte, dbit,
> >   							iter.level);
> > -		__handle_changed_spte(kvm, iter.as_id, iter.gfn, iter.old_spte,
> > -				      iter.old_spte & ~dbit, iter.level, false);
> > +		trace_kvm_tdp_mmu_spte_changed(iter.as_id, iter.gfn, iter.level,
> 
> Here the first parameter "kvm" is no longer used in this context.
> 
> Please help confirm that for clear_dirty_pt_masked(), should the "struct kvm
> *kvm" parameter be cleared from the list of incoming parameters ?

Hmm, there's only one caller, so keeping @kvm around "just in case" probably
doesn't make sense, e.g. adding it back so that we could do KVM_BUG_ON() in the
future wouldn't require much churn.

That said, I'm tempted to move the lockdep so that it's more obvious why it's safe
for clear_dirty_pt_masked() to use the non-atomic (for non-volatile SPTEs)
tdp_mmu_clear_spte_bits() helper.  for_each_tdp_mmu_root() does its own lockdep,
so the only "loss" in lockdep coverage is if the list is completely empty.

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 512163d52194..0b4f03bef70e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1600,6 +1600,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
                                                   shadow_dirty_mask;
        struct tdp_iter iter;
 
+       lockdep_assert_held_write(&kvm->mmu_lock);
+
        rcu_read_lock();
 
        tdp_root_for_each_leaf_pte(iter, root, gfn + __ffs(mask),
@@ -1646,7 +1648,6 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 {
        struct kvm_mmu_page *root;
 
-       lockdep_assert_held_write(&kvm->mmu_lock);
        for_each_tdp_mmu_root(kvm, root, slot->as_id)
                clear_dirty_pt_masked(kvm, root, gfn, mask, wrprot);
 }



^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-06-26 21:37 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-21 22:00 [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 01/13] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 02/13] KVM: x86/mmu: Use kvm_ad_enabled() to determine if TDP MMU SPTEs need wrprot Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 03/13] KVM: x86/mmu: Consolidate Dirty vs. Writable clearing logic in TDP MMU Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 04/13] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 05/13] KVM: x86/mmu: Drop access tracking checks when clearing TDP MMU dirty bits Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 06/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
2023-06-25  7:44   ` Like Xu
2023-06-26 21:37     ` Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 07/13] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 08/13] KVM: x86/mmu: Clear only A-bit (if enabled) when aging TDP MMU SPTEs Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 09/13] KVM: x86/mmu: Drop unnecessary dirty log checks " Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 10/13] KVM: x86/mmu: Bypass __handle_changed_spte() " Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 11/13] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 12/13] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() Sean Christopherson
2023-03-21 22:00 ` [PATCH v4 13/13] KVM: x86/mmu: Merge all handle_changed_pte*() functions Sean Christopherson
2023-03-31  0:13 ` [PATCH v4 00/13] KVM: x86/mmu: Optimize clear dirty log Vipin Sharma
2023-04-04 23:44 ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).