* [RFC PATCH 0/6] Pass memslot around during page fault handling
@ 2021-08-13 20:34 David Matlack
  2021-08-13 20:34 ` [RFC PATCH 1/6] KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment David Matlack
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

This series avoids kvm_vcpu_gfn_to_memslot() calls during page fault
handling by passing around the memslot in struct kvm_page_fault. This
idea came from Ben Gardon who authored a similar series in Google's
kernel.

This series is an RFC because kvm_vcpu_gfn_to_memslot() calls are
actually quite cheap after commit fe22ed827c5b ("KVM: Cache the last
used slot index per vCPU") since we always hit the cache. However
profiling shows there is still some time (1-2%) spent in
kvm_vcpu_gfn_to_memslot() and that hot instructions are the memory loads
for kvm->memslots[as_id] and slots->used_slots. This series eliminates
this remaining overhead but at the cost of a bit of code churn.

Design
------

We can avoid the cost of kvm_vcpu_gfn_to_memslot() by looking up the
slot once and passing it around. In fact this is quite easy to do now
that KVM passes around struct kvm_page_fault to most of the page fault
handling code.  We can store the slot there without changing most of the
call sites.
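
For illustration, the shape of the change is roughly the following
(a condensed sketch of patch 3, not the literal diff):

  struct kvm_page_fault {
          ...
          /* Shifted addr, or result of guest page table walk if addr is a gva. */
          gfn_t gfn;

          /* The memslot containing gfn. May be NULL. */
          struct kvm_memory_slot *slot;
          ...
  };

  /* Look the slot up once, as soon as the gfn is known ... */
  fault->gfn = fault->addr >> PAGE_SHIFT;
  fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);

  /* ... and have the rest of the fault path read fault->slot instead of
   * calling kvm_vcpu_gfn_to_memslot() again, e.g. in kvm_faultin_pfn():
   */
  struct kvm_memory_slot *slot = fault->slot;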

The one exception to this is mmu_set_spte, which does not take a
kvm_page_fault since it is also used during spte prefetching. There are
three memslots lookups under mmu_set_spte:

mmu_set_spte
  rmap_add
    kvm_vcpu_gfn_to_memslot
  rmap_recycle
    kvm_vcpu_gfn_to_memslot
  set_spte
    make_spte
      mmu_try_to_unsync_pages
        kvm_page_track_is_active
          kvm_vcpu_gfn_to_memslot

Avoiding these lookups requires plumbing the slot through all of the
above functions. I explored creating a synthetic kvm_page_fault for
prefetching so that kvm_page_fault could be passed to all of these
functions instead, but that resulted in even more code churn.

Patches
-------

Patches 1-2 are small cleanups related to the series.

Patches 3-4 pass the memslot through kvm_page_fault and use it where
kvm_page_fault is already accessible.

Patches 5-6 plumb the memslot down into the guts of mmu_set_spte to
avoid the remaining memslot lookups.

Performance
-----------

I measured the performance using dirty_log_perf_test and taking the
average "Populate memory time" over 10 runs. To help inform whether or
not different parts of this series are worth the code churn, I measured
the performance of patches 1-4 and patches 1-6 separately.

Test                            | tdp_mmu | kvm/queue | Patches 1-4 | Patches 1-6
------------------------------- | ------- | --------- | ----------- | -----------
./dirty_log_perf_test -v64      | Y       | 5.22s     | 5.20s       | 5.20s
./dirty_log_perf_test -v64 -x64 | Y       | 5.23s     | 5.14s       | 5.14s
./dirty_log_perf_test -v64      | N       | 17.14s    | 16.39s      | 15.36s
./dirty_log_perf_test -v64 -x64 | N       | 17.17s    | 16.60s      | 15.31s

This series provides no performance improvement to the tdp_mmu but
improves the legacy MMU page fault handling by about 10%.

David Matlack (6):
  KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment
  KVM: x86/mmu: Fold rmap_recycle into rmap_add
  KVM: x86/mmu: Pass around the memslot in kvm_page_fault
  KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track
  KVM: x86/mmu: Avoid memslot lookup in rmap_add
  KVM: x86/mmu: Avoid memslot lookup in mmu_try_to_unsync_pages

 arch/x86/include/asm/kvm_page_track.h |   4 +-
 arch/x86/kvm/mmu.h                    |   5 +-
 arch/x86/kvm/mmu/mmu.c                | 110 +++++++++-----------------
 arch/x86/kvm/mmu/mmu_internal.h       |   3 +-
 arch/x86/kvm/mmu/page_track.c         |   6 +-
 arch/x86/kvm/mmu/paging_tmpl.h        |  18 ++++-
 arch/x86/kvm/mmu/spte.c               |  11 +--
 arch/x86/kvm/mmu/spte.h               |   9 ++-
 arch/x86/kvm/mmu/tdp_mmu.c            |  12 +--
 9 files changed, 80 insertions(+), 98 deletions(-)

-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 1/6] KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
@ 2021-08-13 20:34 ` David Matlack
  2021-08-13 20:35 ` [RFC PATCH 2/6] KVM: x86/mmu: Fold rmap_recycle into rmap_add David Matlack
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

try_async_pf was renamed just before the comment was added and it never
got updated.

Fixes: 0ed0de3de2d9 ("KVM: MMU: change try_async_pf() arguments to kvm_page_fault")
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index ce369a533800..2c726b255fa8 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -158,7 +158,7 @@ struct kvm_page_fault {
 	/* Shifted addr, or result of guest page table walk if addr is a gva.  */
 	gfn_t gfn;
 
-	/* Outputs of try_async_pf.  */
+	/* Outputs of kvm_faultin_pfn.  */
 	kvm_pfn_t pfn;
 	hva_t hva;
 	bool map_writable;
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 2/6] KVM: x86/mmu: Fold rmap_recycle into rmap_add
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
  2021-08-13 20:34 ` [RFC PATCH 1/6] KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment David Matlack
@ 2021-08-13 20:35 ` David Matlack
  2021-08-13 20:35 ` [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault David Matlack
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

Consolidate rmap_recycle and rmap_add into a single function since they
are only ever called together (and only from one place). This has a nice
side effect of eliminating an extra kvm_vcpu_gfn_to_memslot(). In
addition, it makes mmu_set_spte(), which is a very long function, a
little shorter.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 40 ++++++++++++++--------------------------
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a272ccbddfa1..3352312ab1c9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1076,20 +1076,6 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
 	return kvm_mmu_memory_cache_nr_free_objects(mc);
 }
 
-static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
-{
-	struct kvm_memory_slot *slot;
-	struct kvm_mmu_page *sp;
-	struct kvm_rmap_head *rmap_head;
-
-	sp = sptep_to_sp(spte);
-	kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
-	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-	rmap_head = gfn_to_rmap(gfn, sp->role.level, slot);
-	return pte_list_add(vcpu, spte, rmap_head);
-}
-
-
 static void rmap_remove(struct kvm *kvm, u64 *spte)
 {
 	struct kvm_memslots *slots;
@@ -1102,9 +1088,9 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	gfn = kvm_mmu_page_get_gfn(sp, spte - sp->spt);
 
 	/*
-	 * Unlike rmap_add and rmap_recycle, rmap_remove does not run in the
-	 * context of a vCPU so have to determine which memslots to use based
-	 * on context information in sp->role.
+	 * Unlike rmap_add, rmap_remove does not run in the context of a vCPU
+	 * so we have to determine which memslots to use based on context
+	 * information in sp->role.
 	 */
 	slots = kvm_memslots_for_spte_role(kvm, sp->role);
 
@@ -1644,19 +1630,24 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 
 #define RMAP_RECYCLE_THRESHOLD 1000
 
-static void rmap_recycle(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
+static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
 	struct kvm_memory_slot *slot;
-	struct kvm_rmap_head *rmap_head;
 	struct kvm_mmu_page *sp;
+	struct kvm_rmap_head *rmap_head;
+	int rmap_count;
 
 	sp = sptep_to_sp(spte);
+	kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
 	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	rmap_head = gfn_to_rmap(gfn, sp->role.level, slot);
+	rmap_count = pte_list_add(vcpu, spte, rmap_head);
 
-	kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0));
-	kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn,
-			KVM_PAGES_PER_HPAGE(sp->role.level));
+	if (rmap_count > RMAP_RECYCLE_THRESHOLD) {
+		kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0));
+		kvm_flush_remote_tlbs_with_address(
+				vcpu->kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
+	}
 }
 
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
@@ -2694,7 +2685,6 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 			bool host_writable)
 {
 	int was_rmapped = 0;
-	int rmap_count;
 	int set_spte_ret;
 	int ret = RET_PF_FIXED;
 	bool flush = false;
@@ -2754,9 +2744,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 
 	if (!was_rmapped) {
 		kvm_update_page_stats(vcpu->kvm, level, 1);
-		rmap_count = rmap_add(vcpu, sptep, gfn);
-		if (rmap_count > RMAP_RECYCLE_THRESHOLD)
-			rmap_recycle(vcpu, sptep, gfn);
+		rmap_add(vcpu, sptep, gfn);
 	}
 
 	return ret;
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
  2021-08-13 20:34 ` [RFC PATCH 1/6] KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment David Matlack
  2021-08-13 20:35 ` [RFC PATCH 2/6] KVM: x86/mmu: Fold rmap_recycle into rmap_add David Matlack
@ 2021-08-13 20:35 ` David Matlack
  2021-08-17 13:00   ` Paolo Bonzini
  2021-08-19 16:37   ` Sean Christopherson
  2021-08-13 20:35 ` [RFC PATCH 4/6] KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track David Matlack
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

The memslot for the faulting gfn is used throughout the page fault
handling code, so capture it in kvm_page_fault as soon as we know the
gfn and use it in the page fault handling code that has direct access
to the kvm_page_fault struct.

This, in combination with the subsequent patch, improves "Populate
memory time" in dirty_log_perf_test by 5% when using the legacy MMU.
There is no discernible improvement to the performance of the TDP MMU.

No functional change intended.

Suggested-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu.h             |  3 +++
 arch/x86/kvm/mmu/mmu.c         | 27 +++++++++------------------
 arch/x86/kvm/mmu/paging_tmpl.h |  2 ++
 arch/x86/kvm/mmu/tdp_mmu.c     | 10 +++++-----
 4 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2c726b255fa8..8d13333f0345 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -158,6 +158,9 @@ struct kvm_page_fault {
 	/* Shifted addr, or result of guest page table walk if addr is a gva.  */
 	gfn_t gfn;
 
+	/* The memslot containing gfn. May be NULL. */
+	struct kvm_memory_slot *slot;
+
 	/* Outputs of kvm_faultin_pfn.  */
 	kvm_pfn_t pfn;
 	hva_t hva;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3352312ab1c9..fb2c95e8df00 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
 
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
-	struct kvm_memory_slot *slot;
+	struct kvm_memory_slot *slot = fault->slot;
 	kvm_pfn_t mask;
 
 	fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
@@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
 		return;
 
-	slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
-	if (!slot)
+	if (kvm_slot_dirty_track_enabled(slot))
 		return;
 
 	/*
@@ -3076,13 +3075,9 @@ static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
  * someone else modified the SPTE from its original value.
  */
 static bool
-fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
+fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 			u64 *sptep, u64 old_spte, u64 new_spte)
 {
-	gfn_t gfn;
-
-	WARN_ON(!sp->role.direct);
-
 	/*
 	 * Theoretically we could also set dirty bit (and flush TLB) here in
 	 * order to eliminate unnecessary PML logging. See comments in
@@ -3098,14 +3093,8 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (cmpxchg64(sptep, old_spte, new_spte) != old_spte)
 		return false;
 
-	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
-		/*
-		 * The gfn of direct spte is stable since it is
-		 * calculated by sp->gfn.
-		 */
-		gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
-		kvm_vcpu_mark_page_dirty(vcpu, gfn);
-	}
+	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
+		mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
 
 	return true;
 }
@@ -3233,7 +3222,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 * since the gfn is not stable for indirect shadow page. See
 		 * Documentation/virt/kvm/locking.rst to get more detail.
 		 */
-		if (fast_pf_fix_direct_spte(vcpu, sp, sptep, spte, new_spte)) {
+		if (fast_pf_fix_direct_spte(vcpu, fault, sptep, spte, new_spte)) {
 			ret = RET_PF_FIXED;
 			break;
 		}
@@ -3823,7 +3812,7 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, int *r)
 {
-	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+	struct kvm_memory_slot *slot = fault->slot;
 	bool async;
 
 	/*
@@ -3888,6 +3877,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	int r;
 
 	fault->gfn = fault->addr >> PAGE_SHIFT;
+	fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+
 	if (page_fault_handle_page_track(vcpu, fault))
 		return RET_PF_EMULATE;
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f70afecbf3a2..50ade6450ace 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -847,6 +847,8 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	}
 
 	fault->gfn = walker.gfn;
+	fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+
 	if (page_fault_handle_page_track(vcpu, fault)) {
 		shadow_page_table_clear_flood(vcpu, fault->addr);
 		return RET_PF_EMULATE;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 47ec9f968406..6f733a68d750 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -533,6 +533,7 @@ static inline bool tdp_mmu_set_spte_atomic_no_dirty_log(struct kvm *kvm,
  * TDP page fault.
  *
  * @vcpu: The vcpu instance that took the TDP page fault.
+ * @fault: The kvm_page_fault being resolved by this SPTE.
  * @iter: a tdp_iter instance currently on the SPTE that should be set
  * @new_spte: The value the SPTE should be set to
  *
@@ -540,6 +541,7 @@ static inline bool tdp_mmu_set_spte_atomic_no_dirty_log(struct kvm *kvm,
  *	    this function will have no side-effects.
  */
 static inline bool tdp_mmu_map_set_spte_atomic(struct kvm_vcpu *vcpu,
+					       struct kvm_page_fault *fault,
 					       struct tdp_iter *iter,
 					       u64 new_spte)
 {
@@ -553,12 +555,10 @@ static inline bool tdp_mmu_map_set_spte_atomic(struct kvm_vcpu *vcpu,
 	 * handle_changed_spte_dirty_log() to leverage vcpu->last_used_slot.
 	 */
 	if (is_writable_pte(new_spte)) {
-		struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, iter->gfn);
-
-		if (slot && kvm_slot_dirty_track_enabled(slot)) {
+		if (fault->slot && kvm_slot_dirty_track_enabled(fault->slot)) {
 			/* Enforced by kvm_mmu_hugepage_adjust. */
 			WARN_ON_ONCE(iter->level > PG_LEVEL_4K);
-			mark_page_dirty_in_slot(kvm, slot, iter->gfn);
+			mark_page_dirty_in_slot(kvm, fault->slot, iter->gfn);
 		}
 	}
 
@@ -934,7 +934,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 
 	if (new_spte == iter->old_spte)
 		ret = RET_PF_SPURIOUS;
-	else if (!tdp_mmu_map_set_spte_atomic(vcpu, iter, new_spte))
+	else if (!tdp_mmu_map_set_spte_atomic(vcpu, fault, iter, new_spte))
 		return RET_PF_RETRY;
 
 	/*
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 4/6] KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
                   ` (2 preceding siblings ...)
  2021-08-13 20:35 ` [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault David Matlack
@ 2021-08-13 20:35 ` David Matlack
  2021-08-13 20:35 ` [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add David Matlack
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

Now that kvm_page_fault has a pointer to the memslot it can be passed
down to the page tracking code to avoid a redundant slot lookup.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/include/asm/kvm_page_track.h |  2 ++
 arch/x86/kvm/mmu/mmu.c                |  2 +-
 arch/x86/kvm/mmu/page_track.c         | 20 +++++++++++++-------
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 87bd6025d91d..8766adb52a73 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -61,6 +61,8 @@ void kvm_slot_page_track_remove_page(struct kvm *kvm,
 				     enum kvm_page_track_mode mode);
 bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
 			      enum kvm_page_track_mode mode);
+bool kvm_slot_page_track_is_active(struct kvm_memory_slot *slot, gfn_t gfn,
+				   enum kvm_page_track_mode mode);
 
 void
 kvm_page_track_register_notifier(struct kvm *kvm,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index fb2c95e8df00..c148d481e9b5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3776,7 +3776,7 @@ static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
 	 * guest is writing the page which is write tracked which can
 	 * not be fixed by page fault handler.
 	 */
-	if (kvm_page_track_is_active(vcpu, fault->gfn, KVM_PAGE_TRACK_WRITE))
+	if (kvm_slot_page_track_is_active(fault->slot, fault->gfn, KVM_PAGE_TRACK_WRITE))
 		return true;
 
 	return false;
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index 269f11f92fd0..a9e2e02f2f4f 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -136,19 +136,14 @@ void kvm_slot_page_track_remove_page(struct kvm *kvm,
 }
 EXPORT_SYMBOL_GPL(kvm_slot_page_track_remove_page);
 
-/*
- * check if the corresponding access on the specified guest page is tracked.
- */
-bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
-			      enum kvm_page_track_mode mode)
+bool kvm_slot_page_track_is_active(struct kvm_memory_slot *slot, gfn_t gfn,
+				   enum kvm_page_track_mode mode)
 {
-	struct kvm_memory_slot *slot;
 	int index;
 
 	if (WARN_ON(!page_track_mode_is_valid(mode)))
 		return false;
 
-	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	if (!slot)
 		return false;
 
@@ -156,6 +151,17 @@ bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
 	return !!READ_ONCE(slot->arch.gfn_track[mode][index]);
 }
 
+/*
+ * check if the corresponding access on the specified guest page is tracked.
+ */
+bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
+			      enum kvm_page_track_mode mode)
+{
+	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+
+	return kvm_slot_page_track_is_active(slot, gfn, mode);
+}
+
 void kvm_page_track_cleanup(struct kvm *kvm)
 {
 	struct kvm_page_track_notifier_head *head;
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
                   ` (3 preceding siblings ...)
  2021-08-13 20:35 ` [RFC PATCH 4/6] KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track David Matlack
@ 2021-08-13 20:35 ` David Matlack
  2021-08-17 12:03   ` Paolo Bonzini
  2021-08-13 20:35 ` [RFC PATCH 6/6] KVM: x86/mmu: Avoid memslot lookup in mmu_try_to_unsync_pages David Matlack
  2021-08-17 11:12 ` [RFC PATCH 0/6] Pass memslot around during page fault handling Paolo Bonzini
  6 siblings, 1 reply; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

Avoid the memslot lookup in rmap_add by passing it down from the fault
handling code to mmu_set_spte and then to rmap_add.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c         | 29 ++++++++---------------------
 arch/x86/kvm/mmu/paging_tmpl.h | 12 +++++++++---
 2 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c148d481e9b5..41e2ef8ad09b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1630,16 +1630,15 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 
 #define RMAP_RECYCLE_THRESHOLD 1000
 
-static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
+static void rmap_add(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+		     u64 *spte, gfn_t gfn)
 {
-	struct kvm_memory_slot *slot;
 	struct kvm_mmu_page *sp;
 	struct kvm_rmap_head *rmap_head;
 	int rmap_count;
 
 	sp = sptep_to_sp(spte);
 	kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
-	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	rmap_head = gfn_to_rmap(gfn, sp->role.level, slot);
 	rmap_count = pte_list_add(vcpu, spte, rmap_head);
 
@@ -2679,9 +2678,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	return ret;
 }
 
-static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			unsigned int pte_access, bool write_fault, int level,
-			gfn_t gfn, kvm_pfn_t pfn, bool speculative,
+static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+			u64 *sptep, unsigned int pte_access, bool write_fault,
+			int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
 			bool host_writable)
 {
 	int was_rmapped = 0;
@@ -2744,24 +2743,12 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 
 	if (!was_rmapped) {
 		kvm_update_page_stats(vcpu->kvm, level, 1);
-		rmap_add(vcpu, sptep, gfn);
+		rmap_add(vcpu, slot, sptep, gfn);
 	}
 
 	return ret;
 }
 
-static kvm_pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
-				     bool no_dirty_log)
-{
-	struct kvm_memory_slot *slot;
-
-	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, no_dirty_log);
-	if (!slot)
-		return KVM_PFN_ERR_FAULT;
-
-	return gfn_to_pfn_memslot_atomic(slot, gfn);
-}
-
 static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 				    struct kvm_mmu_page *sp,
 				    u64 *start, u64 *end)
@@ -2782,7 +2769,7 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 		return -1;
 
 	for (i = 0; i < ret; i++, gfn++, start++) {
-		mmu_set_spte(vcpu, start, access, false, sp->role.level, gfn,
+		mmu_set_spte(vcpu, slot, start, access, false, sp->role.level, gfn,
 			     page_to_pfn(pages[i]), true, true);
 		put_page(pages[i]);
 	}
@@ -2979,7 +2966,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			account_huge_nx_page(vcpu->kvm, sp);
 	}
 
-	ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
+	ret = mmu_set_spte(vcpu, fault->slot, it.sptep, ACC_ALL,
 			   fault->write, fault->goal_level, base_gfn, fault->pfn,
 			   fault->prefault, fault->map_writable);
 	if (ret == RET_PF_SPURIOUS)
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 50ade6450ace..653ca44afa58 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -561,6 +561,7 @@ static bool
 FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 		     u64 *spte, pt_element_t gpte, bool no_dirty_log)
 {
+	struct kvm_memory_slot *slot;
 	unsigned pte_access;
 	gfn_t gfn;
 	kvm_pfn_t pfn;
@@ -573,8 +574,13 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	gfn = gpte_to_gfn(gpte);
 	pte_access = sp->role.access & FNAME(gpte_access)(gpte);
 	FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte);
-	pfn = pte_prefetch_gfn_to_pfn(vcpu, gfn,
+
+	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn,
 			no_dirty_log && (pte_access & ACC_WRITE_MASK));
+	if (!slot)
+		return false;
+
+	pfn = gfn_to_pfn_memslot_atomic(slot, gfn);
 	if (is_error_pfn(pfn))
 		return false;
 
@@ -582,7 +588,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 * we call mmu_set_spte() with host_writable = true because
 	 * pte_prefetch_gfn_to_pfn always gets a writable pfn.
 	 */
-	mmu_set_spte(vcpu, spte, pte_access, false, PG_LEVEL_4K, gfn, pfn,
+	mmu_set_spte(vcpu, slot, spte, pte_access, false, PG_LEVEL_4K, gfn, pfn,
 		     true, true);
 
 	kvm_release_pfn_clean(pfn);
@@ -749,7 +755,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 		}
 	}
 
-	ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, fault->write,
+	ret = mmu_set_spte(vcpu, fault->slot, it.sptep, gw->pte_access, fault->write,
 			   it.level, base_gfn, fault->pfn, fault->prefault,
 			   fault->map_writable);
 	if (ret == RET_PF_SPURIOUS)
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [RFC PATCH 6/6] KVM: x86/mmu: Avoid memslot lookup in mmu_try_to_unsync_pages
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
                   ` (4 preceding siblings ...)
  2021-08-13 20:35 ` [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add David Matlack
@ 2021-08-13 20:35 ` David Matlack
  2021-08-17 11:12 ` [RFC PATCH 0/6] Pass memslot around during page fault handling Paolo Bonzini
  6 siblings, 0 replies; 18+ messages in thread
From: David Matlack @ 2021-08-13 20:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, David Matlack

mmu_try_to_unsync_pages checks if page tracking is active for the given
gfn, which requires knowing the memslot. We can pass down the memslot
all the way from mmu_set_spte to avoid this lookup.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/include/asm/kvm_page_track.h |  2 --
 arch/x86/kvm/mmu/mmu.c                | 16 +++++++++-------
 arch/x86/kvm/mmu/mmu_internal.h       |  3 ++-
 arch/x86/kvm/mmu/page_track.c         | 14 +++-----------
 arch/x86/kvm/mmu/paging_tmpl.h        |  4 +++-
 arch/x86/kvm/mmu/spte.c               | 11 ++++++-----
 arch/x86/kvm/mmu/spte.h               |  9 +++++----
 arch/x86/kvm/mmu/tdp_mmu.c            |  2 +-
 8 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 8766adb52a73..be76dda0952b 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -59,8 +59,6 @@ void kvm_slot_page_track_add_page(struct kvm *kvm,
 void kvm_slot_page_track_remove_page(struct kvm *kvm,
 				     struct kvm_memory_slot *slot, gfn_t gfn,
 				     enum kvm_page_track_mode mode);
-bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
-			      enum kvm_page_track_mode mode);
 bool kvm_slot_page_track_is_active(struct kvm_memory_slot *slot, gfn_t gfn,
 				   enum kvm_page_track_mode mode);
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 41e2ef8ad09b..136056b13e15 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2583,7 +2583,8 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
  * were marked unsync (or if there is no shadow page), -EPERM if the SPTE must
  * be write-protected.
  */
-int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
+int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+			    gfn_t gfn, bool can_unsync)
 {
 	struct kvm_mmu_page *sp;
 
@@ -2592,7 +2593,7 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
 	 * track machinery is used to write-protect upper-level shadow pages,
 	 * i.e. this guards the role.level == 4K assertion below!
 	 */
-	if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
+	if (kvm_slot_page_track_is_active(slot, gfn, KVM_PAGE_TRACK_WRITE))
 		return -EPERM;
 
 	/*
@@ -2654,8 +2655,8 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
 	return 0;
 }
 
-static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-		    unsigned int pte_access, int level,
+static int set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+		    u64 *sptep, unsigned int pte_access, int level,
 		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
 		    bool can_unsync, bool host_writable)
 {
@@ -2665,8 +2666,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 
 	sp = sptep_to_sp(sptep);
 
-	ret = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
-			can_unsync, host_writable, sp_ad_disabled(sp), &spte);
+	ret = make_spte(vcpu, slot, pte_access, level, gfn, pfn, *sptep,
+			speculative, can_unsync, host_writable,
+			sp_ad_disabled(sp), &spte);
 
 	if (spte & PT_WRITABLE_MASK)
 		kvm_vcpu_mark_page_dirty(vcpu, gfn);
@@ -2717,7 +2719,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 			was_rmapped = 1;
 	}
 
-	set_spte_ret = set_spte(vcpu, sptep, pte_access, level, gfn, pfn,
+	set_spte_ret = set_spte(vcpu, slot, sptep, pte_access, level, gfn, pfn,
 				speculative, true, host_writable);
 	if (set_spte_ret & SET_SPTE_WRITE_PROTECTED_PT) {
 		if (write_fault)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 658d8d228d43..e0c9c68ff617 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -116,7 +116,8 @@ static inline bool kvm_vcpu_ad_need_write_protect(struct kvm_vcpu *vcpu)
 	       kvm_x86_ops.cpu_dirty_log_size;
 }
 
-int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync);
+int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+			    gfn_t gfn, bool can_unsync);
 
 void kvm_mmu_gfn_disallow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index a9e2e02f2f4f..4179f4712152 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -136,6 +136,9 @@ void kvm_slot_page_track_remove_page(struct kvm *kvm,
 }
 EXPORT_SYMBOL_GPL(kvm_slot_page_track_remove_page);
 
+/*
+ * check if the corresponding access on the specified guest page is tracked.
+ */
 bool kvm_slot_page_track_is_active(struct kvm_memory_slot *slot, gfn_t gfn,
 				   enum kvm_page_track_mode mode)
 {
@@ -151,17 +154,6 @@ bool kvm_slot_page_track_is_active(struct kvm_memory_slot *slot, gfn_t gfn,
 	return !!READ_ONCE(slot->arch.gfn_track[mode][index]);
 }
 
-/*
- * check if the corresponding access on the specified guest page is tracked.
- */
-bool kvm_page_track_is_active(struct kvm_vcpu *vcpu, gfn_t gfn,
-			      enum kvm_page_track_mode mode)
-{
-	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-
-	return kvm_slot_page_track_is_active(slot, gfn, mode);
-}
-
 void kvm_page_track_cleanup(struct kvm *kvm)
 {
 	struct kvm_page_track_notifier_head *head;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 653ca44afa58..f85786534d86 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -1086,6 +1086,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 	first_pte_gpa = FNAME(get_level1_sp_gpa)(sp);
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
+		struct kvm_memory_slot *slot;
 		unsigned pte_access;
 		pt_element_t gpte;
 		gpa_t pte_gpa;
@@ -1135,7 +1136,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 
 		host_writable = sp->spt[i] & shadow_host_writable_mask;
 
-		set_spte_ret |= set_spte(vcpu, &sp->spt[i],
+		slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+		set_spte_ret |= set_spte(vcpu, slot, &sp->spt[i],
 					 pte_access, PG_LEVEL_4K,
 					 gfn, spte_to_pfn(sp->spt[i]),
 					 true, false, host_writable);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 3e97cdb13eb7..5f0c99532a96 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -89,10 +89,11 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 				     E820_TYPE_RAM);
 }
 
-int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
-		     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
-		     bool can_unsync, bool host_writable, bool ad_disabled,
-		     u64 *new_spte)
+int make_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+	      unsigned int pte_access, int level,
+	      gfn_t gfn, kvm_pfn_t pfn,
+	      u64 old_spte, bool speculative, bool can_unsync,
+	      bool host_writable, bool ad_disabled, u64 *new_spte)
 {
 	u64 spte = SPTE_MMU_PRESENT_MASK;
 	int ret = 0;
@@ -159,7 +160,7 @@ int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
 		 * e.g. it's write-tracked (upper-level SPs) or has one or more
 		 * shadow pages and unsync'ing pages is not allowed.
 		 */
-		if (mmu_try_to_unsync_pages(vcpu, gfn, can_unsync)) {
+		if (mmu_try_to_unsync_pages(vcpu, slot, gfn, can_unsync)) {
 			pgprintk("%s: found shadow page for %llx, marking ro\n",
 				 __func__, gfn);
 			ret |= SET_SPTE_WRITE_PROTECTED_PT;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index eb7b227fc6cf..6d2446f4c591 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -339,10 +339,11 @@ static inline u64 get_mmio_spte_generation(u64 spte)
 #define SET_SPTE_NEED_REMOTE_TLB_FLUSH BIT(1)
 #define SET_SPTE_SPURIOUS              BIT(2)
 
-int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
-		     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
-		     bool can_unsync, bool host_writable, bool ad_disabled,
-		     u64 *new_spte);
+int make_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+	      unsigned int pte_access, int level,
+	      gfn_t gfn, kvm_pfn_t pfn,
+	      u64 old_spte, bool speculative, bool can_unsync,
+	      bool host_writable, bool ad_disabled, u64 *new_spte);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6f733a68d750..cd72184327a4 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -927,7 +927,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	if (unlikely(is_noslot_pfn(fault->pfn)))
 		new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
 	else
-		make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
+		make_spte_ret = make_spte(vcpu, fault->slot, ACC_ALL, iter->level, iter->gfn,
 					 fault->pfn, iter->old_spte, fault->prefault, true,
 					 fault->map_writable, !shadow_accessed_mask,
 					 &new_spte);
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* Re: [RFC PATCH 0/6] Pass memslot around during page fault handling
  2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
                   ` (5 preceding siblings ...)
  2021-08-13 20:35 ` [RFC PATCH 6/6] KVM: x86/mmu: Avoid memslot lookup in mmu_try_to_unsync_pages David Matlack
@ 2021-08-17 11:12 ` Paolo Bonzini
  6 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-08-17 11:12 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On 13/08/21 22:34, David Matlack wrote:
> This series avoids kvm_vcpu_gfn_to_memslot() calls during page fault
> handling by passing around the memslot in struct kvm_page_fault.
> [...]

Queued patches 1-3, thanks.  For the others, see the reply to patch 6.

Paolo



* Re: [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add
  2021-08-13 20:35 ` [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add David Matlack
@ 2021-08-17 12:03   ` Paolo Bonzini
  2021-08-19 16:15     ` David Matlack
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2021-08-17 12:03 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On 13/08/21 22:35, David Matlack wrote:
> Avoid the memslot lookup in rmap_add by passing it down from the fault
> handling code to mmu_set_spte and then to rmap_add.
> 
> No functional change intended.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>

I think before doing this we should take another look at the arguments
for make_spte, set_spte and mmu_set_spte:

static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
                         u64 *sptep, unsigned int pte_access, bool write_fault,
                         int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
                         bool host_writable)

static int set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
                     u64 *sptep, unsigned int pte_access, int level,
                     gfn_t gfn, kvm_pfn_t pfn, bool speculative,
                     bool can_unsync, bool host_writable)

int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
                      gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
                      bool can_unsync, bool host_writable, bool ad_disabled,
                      u64 *new_spte)

In particular:

- set_spte should be inlined in its two callers.  The SET_SPTE_*
flags are overkill if both functions can just call make_spte+mmu_spte_update:
mmu_set_spte can check *sptep == spte and return RET_PF_SPURIOUS directly,
while SET_SPTE_NEED_REMOTE_TLB_FLUSH can become just a bool that is
returned by make_spte.

- level and ad_disabled can be replaced by a single pointer to struct
kvm_mmu_page (tdp_mmu does not set ad_disabled in page_role_for_level,
but that's not an issue).

- in mmu_set_spte, write_fault, speculative and host_writable are either
false/true/true (prefetching) or fault->write, fault->prefault,
fault->map_writable (pagefault).  So they can be replaced by a single
struct kvm_page_fault pointer, where NULL means false/true/true.  Then
if set_spte is inlined, the ugly bool arguments only remain in make_spte
(minus ad_disabled).

This does not remove the need for a separate slot parameter,
but at least the balance is that there are no extra arguments to
make_spte (two go, level and ad_disabled; two come, sp and slot).
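
Roughly, that would give signatures along these lines (hypothetical,
just to illustrate the direction; nothing like this has been posted):

  /* fault == NULL means prefetch, i.e. write_fault = false,
   * speculative = true, host_writable = true.
   */
  static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
                          u64 *sptep, unsigned int pte_access, int level,
                          gfn_t gfn, kvm_pfn_t pfn,
                          struct kvm_page_fault *fault);

  /* sp replaces level + ad_disabled; the return value says whether a
   * remote TLB flush is needed, replacing SET_SPTE_NEED_REMOTE_TLB_FLUSH.
   */
  bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
                 struct kvm_memory_slot *slot, unsigned int pte_access,
                 gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
                 bool can_unsync, bool host_writable, u64 *new_spte);

with set_spte() gone, inlined into its two callers.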

I've started hacking on the above, but didn't quite finish.  I'll
keep patches 4-6 in my queue, but they'll have to wait for 5.15.

Paolo

> [...]


* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-13 20:35 ` [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault David Matlack
@ 2021-08-17 13:00   ` Paolo Bonzini
  2021-08-17 16:13     ` David Matlack
  2021-08-19 16:37   ` Sean Christopherson
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2021-08-17 13:00 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On 13/08/21 22:35, David Matlack wrote:
> -	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
> -		/*
> -		 * The gfn of direct spte is stable since it is
> -		 * calculated by sp->gfn.
> -		 */
> -		gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> -		kvm_vcpu_mark_page_dirty(vcpu, gfn);
> -	}
> +	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
> +		mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);

Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.

Paolo



* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-17 13:00   ` Paolo Bonzini
@ 2021-08-17 16:13     ` David Matlack
  2021-08-17 17:02       ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: David Matlack @ 2021-08-17 16:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm list, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On Tue, Aug 17, 2021 at 6:00 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 13/08/21 22:35, David Matlack wrote:
> > -     if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
> > -             /*
> > -              * The gfn of direct spte is stable since it is
> > -              * calculated by sp->gfn.
> > -              */
> > -             gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> > -             kvm_vcpu_mark_page_dirty(vcpu, gfn);
> > -     }
> > +     if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
> > +             mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
>
> Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.

What do you mean? kvm_vcpu_mark_page_dirty ultimately just calls
mark_page_dirty_in_slot.

>
> Paolo
>


* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-17 16:13     ` David Matlack
@ 2021-08-17 17:02       ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-08-17 17:02 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm list, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On 17/08/21 18:13, David Matlack wrote:
> On Tue, Aug 17, 2021 at 6:00 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 13/08/21 22:35, David Matlack wrote:
>>> -     if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
>>> -             /*
>>> -              * The gfn of direct spte is stable since it is
>>> -              * calculated by sp->gfn.
>>> -              */
>>> -             gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
>>> -             kvm_vcpu_mark_page_dirty(vcpu, gfn);
>>> -     }
>>> +     if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
>>> +             mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
>>
>> Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.
> 
> What do you mean? kvm_vcpu_mark_page_dirty ultimately just calls
> mark_page_dirty_in_slot.

Yeah, I was thinking of some very old version of the dirty page ring 
buffer patches.  What I wrote makes no sense. :)

Paolo



* Re: [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add
  2021-08-17 12:03   ` Paolo Bonzini
@ 2021-08-19 16:15     ` David Matlack
  2021-08-19 16:39       ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: David Matlack @ 2021-08-19 16:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm list, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson

On Tue, Aug 17, 2021 at 5:03 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 13/08/21 22:35, David Matlack wrote:
> > Avoid the memslot lookup in rmap_add by passing it down from the fault
> > handling code to mmu_set_spte and then to rmap_add.
> >
> > No functional change intended.
> >
> > Signed-off-by: David Matlack <dmatlack@google.com>
>
> I think before doing this we should take another look at the arguments
> for make_spte, set_spte and mmu_set_spte:
>
> static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
>                          u64 *sptep, unsigned int pte_access, bool write_fault,
>                          int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
>                          bool host_writable)
>
> static int set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
>                      u64 *sptep, unsigned int pte_access, int level,
>                      gfn_t gfn, kvm_pfn_t pfn, bool speculative,
>                      bool can_unsync, bool host_writable)
>
> int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
>                       gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
>                       bool can_unsync, bool host_writable, bool ad_disabled,
>                       u64 *new_spte)
>
> In particular:
>
> - set_spte should be inlined in its two callers.  The SET_SPTE_*
> flags are overkill if both functions can just call make_spte+mmu_spte_update:
> mmu_set_spte can check *sptep == spte and return RET_PF_SPURIOUS directly,
> while SET_SPTE_NEED_REMOTE_TLB_FLUSH can become just a bool that is
> returned by make_spte.
>
> - level and ad_disabled can be replaced by a single pointer to struct
> kvm_mmu_page (tdp_mmu does not set ad_disabled in page_role_for_level,
> but that's not an issue).
>
> - in mmu_set_spte, write_fault, speculative and host_writable are either
> false/true/true (prefetching) or fault->write, fault->prefault,
> fault->map_writable (pagefault).  So they can be replaced by a single
> struct kvm_page_fault pointer, where NULL means false/true/true.  Then
> if set_spte is inlined, the ugly bool arguments only remain in make_spte
> (minus ad_disabled).
>
> This does not remove the need for a separate slot parameter,
> but at least the balance is that there are no extra arguments to
> make_spte (two go, level and ad_disabled; two come, sp and slot).
>
> I've started hacking on the above, but didn't quite finish.  I'll
> keep patches 4-6 in my queue, but they'll have to wait for 5.15.

Ack. 5.15 sounds good. Let me know if you want any help with testing.
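
For illustration only, the consolidation described above could end up with
prototypes along these lines (the exact names and parameter order are
assumptions, not necessarily what was eventually merged):

/*
 * Sketch of the proposed shape: level and ad_disabled collapse into the
 * shadow page pointer, write_fault/speculative/host_writable collapse
 * into an optional kvm_page_fault (NULL means the prefetch case, i.e.
 * write_fault=false, speculative=true, host_writable=true), and
 * make_spte() returns whether a remote TLB flush is needed instead of
 * using the SET_SPTE_* flags.
 */
static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
			u64 *sptep, unsigned int pte_access, gfn_t gfn,
			kvm_pfn_t pfn, struct kvm_page_fault *fault);

bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
	       struct kvm_memory_slot *slot, unsigned int pte_access,
	       gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
	       bool can_unsync, bool host_writable, u64 *new_spte);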

>
> Paolo
>
> > ---
> >   arch/x86/kvm/mmu/mmu.c         | 29 ++++++++---------------------
> >   arch/x86/kvm/mmu/paging_tmpl.h | 12 +++++++++---
> >   2 files changed, 17 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index c148d481e9b5..41e2ef8ad09b 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1630,16 +1630,15 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> >
> >   #define RMAP_RECYCLE_THRESHOLD 1000
> >
> > -static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
> > +static void rmap_add(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> > +                  u64 *spte, gfn_t gfn)
> >   {
> > -     struct kvm_memory_slot *slot;
> >       struct kvm_mmu_page *sp;
> >       struct kvm_rmap_head *rmap_head;
> >       int rmap_count;
> >
> >       sp = sptep_to_sp(spte);
> >       kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
> > -     slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> >       rmap_head = gfn_to_rmap(gfn, sp->role.level, slot);
> >       rmap_count = pte_list_add(vcpu, spte, rmap_head);
> >
> > @@ -2679,9 +2678,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> >       return ret;
> >   }
> >
> > -static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> > -                     unsigned int pte_access, bool write_fault, int level,
> > -                     gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> > +static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> > +                     u64 *sptep, unsigned int pte_access, bool write_fault,
> > +                     int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> >                       bool host_writable)
> >   {
> >       int was_rmapped = 0;
> > @@ -2744,24 +2743,12 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> >
> >       if (!was_rmapped) {
> >               kvm_update_page_stats(vcpu->kvm, level, 1);
> > -             rmap_add(vcpu, sptep, gfn);
> > +             rmap_add(vcpu, slot, sptep, gfn);
> >       }
> >
> >       return ret;
> >   }
> >
> > -static kvm_pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
> > -                                  bool no_dirty_log)
> > -{
> > -     struct kvm_memory_slot *slot;
> > -
> > -     slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, no_dirty_log);
> > -     if (!slot)
> > -             return KVM_PFN_ERR_FAULT;
> > -
> > -     return gfn_to_pfn_memslot_atomic(slot, gfn);
> > -}
> > -
> >   static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
> >                                   struct kvm_mmu_page *sp,
> >                                   u64 *start, u64 *end)
> > @@ -2782,7 +2769,7 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
> >               return -1;
> >
> >       for (i = 0; i < ret; i++, gfn++, start++) {
> > -             mmu_set_spte(vcpu, start, access, false, sp->role.level, gfn,
> > +             mmu_set_spte(vcpu, slot, start, access, false, sp->role.level, gfn,
> >                            page_to_pfn(pages[i]), true, true);
> >               put_page(pages[i]);
> >       }
> > @@ -2979,7 +2966,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >                       account_huge_nx_page(vcpu->kvm, sp);
> >       }
> >
> > -     ret = mmu_set_spte(vcpu, it.sptep, ACC_ALL,
> > +     ret = mmu_set_spte(vcpu, fault->slot, it.sptep, ACC_ALL,
> >                          fault->write, fault->goal_level, base_gfn, fault->pfn,
> >                          fault->prefault, fault->map_writable);
> >       if (ret == RET_PF_SPURIOUS)
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 50ade6450ace..653ca44afa58 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -561,6 +561,7 @@ static bool
> >   FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> >                    u64 *spte, pt_element_t gpte, bool no_dirty_log)
> >   {
> > +     struct kvm_memory_slot *slot;
> >       unsigned pte_access;
> >       gfn_t gfn;
> >       kvm_pfn_t pfn;
> > @@ -573,8 +574,13 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> >       gfn = gpte_to_gfn(gpte);
> >       pte_access = sp->role.access & FNAME(gpte_access)(gpte);
> >       FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte);
> > -     pfn = pte_prefetch_gfn_to_pfn(vcpu, gfn,
> > +
> > +     slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn,
> >                       no_dirty_log && (pte_access & ACC_WRITE_MASK));
> > +     if (!slot)
> > +             return false;
> > +
> > +     pfn = gfn_to_pfn_memslot_atomic(slot, gfn);
> >       if (is_error_pfn(pfn))
> >               return false;
> >
> > @@ -582,7 +588,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> >        * we call mmu_set_spte() with host_writable = true because
> >        * pte_prefetch_gfn_to_pfn always gets a writable pfn.
> >        */
> > -     mmu_set_spte(vcpu, spte, pte_access, false, PG_LEVEL_4K, gfn, pfn,
> > +     mmu_set_spte(vcpu, slot, spte, pte_access, false, PG_LEVEL_4K, gfn, pfn,
> >                    true, true);
> >
> >       kvm_release_pfn_clean(pfn);
> > @@ -749,7 +755,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> >               }
> >       }
> >
> > -     ret = mmu_set_spte(vcpu, it.sptep, gw->pte_access, fault->write,
> > +     ret = mmu_set_spte(vcpu, fault->slot, it.sptep, gw->pte_access, fault->write,
> >                          it.level, base_gfn, fault->pfn, fault->prefault,
> >                          fault->map_writable);
> >       if (ret == RET_PF_SPURIOUS)
> >
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-13 20:35 ` [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault David Matlack
  2021-08-17 13:00   ` Paolo Bonzini
@ 2021-08-19 16:37   ` Sean Christopherson
  2021-08-20 22:54     ` David Matlack
  1 sibling, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-08-19 16:37 UTC (permalink / raw)
  To: David Matlack
  Cc: Paolo Bonzini, kvm, Ben Gardon, Joerg Roedel, Jim Mattson,
	Wanpeng Li, Vitaly Kuznetsov

On Fri, Aug 13, 2021, David Matlack wrote:
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3352312ab1c9..fb2c95e8df00 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
>  
>  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  {
> -	struct kvm_memory_slot *slot;
> +	struct kvm_memory_slot *slot = fault->slot;
>  	kvm_pfn_t mask;
>  
>  	fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>  	if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
>  		return;
>  
> -	slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> -	if (!slot)
> +	if (kvm_slot_dirty_track_enabled(slot))

This is unnecessarily obfuscated.  It relies on the is_error_noslot_pfn() to
ensure fault->slot is valid, but the only reason that helper is used is because
it was the most efficient code when slot wasn't available.  IMO, this would be
better:

	if (!slot || kvm_slot_dirty_track_enabled(slot))
		return;

	if (kvm_is_reserved_pfn(fault->pfn))
		return;

On a related topic, a good follow-up to this series would be to pass @fault into
the prefetch helpers, and modify the prefetch logic to re-use fault->slot and
refuse to prefetch across memslot boundaries.  That would eliminate all users of
gfn_to_memslot_dirty_bitmap() and allow us to drop that abomination.
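
As a rough illustration of that follow-up idea (hypothetical code, not part
of this series): the prefetch path could take the fault and simply decline
any gfn that falls outside fault->slot, for example:

/*
 * Hypothetical helper: reuse fault->slot for prefetching and refuse to
 * prefetch a gfn outside the faulting memslot, removing the need for a
 * per-gfn memslot lookup in the prefetch path.
 */
static bool prefetch_gfn_in_fault_slot(struct kvm_page_fault *fault, gfn_t gfn)
{
	struct kvm_memory_slot *slot = fault->slot;

	return slot && gfn >= slot->base_gfn &&
	       gfn < slot->base_gfn + slot->npages;
}

A check like this keeps prefetching confined to memory that is known to share
the faulting slot's dirty-logging and writability attributes.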

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add
  2021-08-19 16:15     ` David Matlack
@ 2021-08-19 16:39       ` Sean Christopherson
  2021-08-19 16:47         ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-08-19 16:39 UTC (permalink / raw)
  To: David Matlack
  Cc: Paolo Bonzini, kvm list, Ben Gardon, Joerg Roedel, Jim Mattson,
	Wanpeng Li, Vitaly Kuznetsov

On Thu, Aug 19, 2021, David Matlack wrote:
> On Tue, Aug 17, 2021 at 5:03 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > I've started hacking on the above, but didn't quite finish.  I'll
> > keep patches 4-6 in my queue, but they'll have to wait for 5.15.
> 
> Ack. 5.15 sounds good. Let me know if you want any help with testing.

Paolo, I assume you meant 5.16?  Or is my math worse than usual?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add
  2021-08-19 16:39       ` Sean Christopherson
@ 2021-08-19 16:47         ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2021-08-19 16:47 UTC (permalink / raw)
  To: Sean Christopherson, David Matlack
  Cc: kvm list, Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov

On 19/08/21 18:39, Sean Christopherson wrote:
> On Thu, Aug 19, 2021, David Matlack wrote:
>> On Tue, Aug 17, 2021 at 5:03 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> I've started hacking on the above, but didn't quite finish.  I'll
>>> keep patches 4-6 in my queue, but they'll have to wait for 5.15.
>>
>> Ack. 5.15 sounds good. Let me know if you want any help with testing.
> 
> Paolo, I assume you meant 5.16?  Or is my math worse than usual?

Yeah, of course.  In fact, kvm/queue is mostly final at this point, 
unless something bad happens in the next 2-3 hours of testing, and 
commit 5a4bfabcc865 will be my first pull request to Linus for 5.15.

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-19 16:37   ` Sean Christopherson
@ 2021-08-20 22:54     ` David Matlack
  2021-08-20 23:02       ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: David Matlack @ 2021-08-20 22:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm list, Ben Gardon, Joerg Roedel, Jim Mattson,
	Wanpeng Li, Vitaly Kuznetsov

On Thu, Aug 19, 2021 at 9:37 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Aug 13, 2021, David Matlack wrote:
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3352312ab1c9..fb2c95e8df00 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
> >
> >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  {
> > -     struct kvm_memory_slot *slot;
> > +     struct kvm_memory_slot *slot = fault->slot;
> >       kvm_pfn_t mask;
> >
> >       fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> > @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >       if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
> >               return;
> >
> > -     slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> > -     if (!slot)
> > +     if (kvm_slot_dirty_track_enabled(slot))
>
> This is unnecessarily obfuscated.

Ugh. It's pure luck too. I meant to check if the slot is null here.

> It relies on the is_error_noslot_pfn() to
> ensure fault->slot is valid, but the only reason that helper is used is because
> it was the most efficient code when slot wasn't available.  IMO, this would be
> better:
>
>         if (!slot || kvm_slot_dirty_track_enabled(slot))
>                 return;
>
>         if (kvm_is_reserved_pfn(fault->pfn))
>                 return;

That looks reasonable to me. I can send a patch next week with this change.

>
> On a related topic, a good follow-up to this series would be to pass @fault into
> the prefetch helpers, and modify the prefetch logic to re-use fault->slot and
> refuse to prefetch across memslot boundaries.  That would eliminate all users of
> gfn_to_memslot_dirty_bitmap() and allow us to drop that abomination.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault
  2021-08-20 22:54     ` David Matlack
@ 2021-08-20 23:02       ` Sean Christopherson
  0 siblings, 0 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-08-20 23:02 UTC (permalink / raw)
  To: David Matlack
  Cc: Paolo Bonzini, kvm list, Ben Gardon, Joerg Roedel, Jim Mattson,
	Wanpeng Li, Vitaly Kuznetsov

On Fri, Aug 20, 2021, David Matlack wrote:
> On Thu, Aug 19, 2021 at 9:37 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Aug 13, 2021, David Matlack wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 3352312ab1c9..fb2c95e8df00 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
> > >
> > >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > >  {
> > > -     struct kvm_memory_slot *slot;
> > > +     struct kvm_memory_slot *slot = fault->slot;
> > >       kvm_pfn_t mask;
> > >
> > >       fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> > > @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > >       if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
> > >               return;
> > >
> > > -     slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> > > -     if (!slot)
> > > +     if (kvm_slot_dirty_track_enabled(slot))
> >
> > This is unnecessarily obfuscated.
> 
> Ugh. It's pure luck too. I meant to check if the slot is null here.

Ha, better to be lucky than good ;-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-08-20 23:02 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-13 20:34 [RFC PATCH 0/6] Pass memslot around during page fault handling David Matlack
2021-08-13 20:34 ` [RFC PATCH 1/6] KVM: x86/mmu: Rename try_async_pf to kvm_faultin_pfn in comment David Matlack
2021-08-13 20:35 ` [RFC PATCH 2/6] KVM: x86/mmu: Fold rmap_recycle into rmap_add David Matlack
2021-08-13 20:35 ` [RFC PATCH 3/6] KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault David Matlack
2021-08-17 13:00   ` Paolo Bonzini
2021-08-17 16:13     ` David Matlack
2021-08-17 17:02       ` Paolo Bonzini
2021-08-19 16:37   ` Sean Christopherson
2021-08-20 22:54     ` David Matlack
2021-08-20 23:02       ` Sean Christopherson
2021-08-13 20:35 ` [RFC PATCH 4/6] KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track David Matlack
2021-08-13 20:35 ` [RFC PATCH 5/6] KVM: x86/mmu: Avoid memslot lookup in rmap_add David Matlack
2021-08-17 12:03   ` Paolo Bonzini
2021-08-19 16:15     ` David Matlack
2021-08-19 16:39       ` Sean Christopherson
2021-08-19 16:47         ` Paolo Bonzini
2021-08-13 20:35 ` [RFC PATCH 6/6] KVM: x86/mmu: Avoid memslot lookup in mmu_try_to_unsync_pages David Matlack
2021-08-17 11:12 ` [RFC PATCH 0/6] Pass memslot around during page fault handling Paolo Bonzini
