* [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU
@ 2023-02-02 18:27 Ben Gardon
  2023-02-02 18:27 ` [PATCH 01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code Ben Gardon
                   ` (21 more replies)
  0 siblings, 22 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
implemented in separate files, with a defined interface to common code.

When the TDP (Two Dimensional Paging) MMU was added to x86 KVM, it came in
a separate file with a (reasonably) clear interface. This led to many
points in the KVM MMU like this:

if (tdp_mmu_on())
	kvm_tdp_mmu_do_stuff()

if (memslots_have_rmaps())
	/* Do whatever was being done before */

The implementations of various functions which preceded the TDP MMU have
remained scattered around mmu.c with no clear identity or interface. Over the
last couple of years, the KVM x86 community has settled on calling the KVM MMU
implementation which preceded the TDP MMU the "Shadow MMU", as it grew
from shadow paging, which supported virtualization on pre-TDP hardware.
(Note that the Shadow MMU can also build TDP page tables, and doesn't
only do shadow paging, so the meaning is a bit overloaded.)

Splitting it out into separate files will give a clear interface and make it
easier to distinguish common x86 MMU code from the code specific to the two
MMU implementations.
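
To illustrate the intended shape (reusing the pseudocode above;
kvm_shadow_mmu_do_stuff() is a placeholder, not an actual function name
from the series), dispatch points in common code would call through a
named interface on both sides instead of open-coding the legacy path:

if (tdp_mmu_on())
	kvm_tdp_mmu_do_stuff()

if (memslots_have_rmaps())
	kvm_shadow_mmu_do_stuff()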

Patches 1-3 are cleanups from Sean.

Patches 4-6 prepare for the refactor by adding files and exporting
functions.

Patch 7 is the big move, transferring 3.5K lines from mmu.c to
shadow_mmu.c.
(It may be best if whoever ends up preparing the pull request with
this patch just dumps my version and re-does the move so that no code is
lost.)

Patches 8 and 9 move the includes for paging_tmpl.h to shadow_mmu.c.

Patches 10-17 clean up the interface between the Shadow MMU and
common MMU code.

The last few patches are in response to feedback on the RFC and move
additional code to the Shadow MMU.

Patch 7 is an enormous change, and doing it all at once in a single
commit all but guarantees merge conflicts and makes it hard to review. I
don't have a good answer to this problem as there's no easy way to move
3.5K lines between files. I tried moving the code bit-by-bit but the
intermediate steps added complexity and ultimately the 50+ patches it
created didn't seem any easier to review.
Doing the big move all at once at least makes it easier to get past when
doing Git archeology, and doing it at the beginning of the series allows the
rest of the commits to still show up in Git blame.

I've tested this series on an Intel Skylake host with kvm-unit-tests and
selftests.

RFC -> v1:
 - RFC: https://lore.kernel.org/all/20221221222418.3307832-1-bgardon@google.com/
 - Moved some more Shadow MMU content to shadow_mmu.c. David Matlack
   pointed out some code I'd missed in the first pass. Added commits
   to the end of the series to achieve this.
 - Dropped is_cpuid_PSE36 and moved all the BUILD_MMU_ROLE*() macros
   to mmu_internal, also as suggested by David Matlack.
 - Added copyright comments to the tops of shadow_mmu.c and .h
 - Tacked some cleanups from Sean onto the beginning of the series.

Ben Gardon (18):
  KVM: x86/MMU: Add shadow_mmu.(c|h)
  KVM: x86/MMU: Expose functions for the Shadow MMU
  KVM: x86/mmu: Get rid of is_cpuid_PSE36()
  KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c
  KVM: x86/MMU: Expose functions for paging_tmpl.h
  KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
  KVM: x86/MMU: Clean up Shadow MMU exports
  KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
  KVM: x86/MMU: Clean up naming of exported Shadow MMU functions
  KVM: x86/MMU: Fix naming on prepare / commit zap page functions
  KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c
  KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
  KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
  KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in
    shadow_mmu.h
  KVM: x86/mmu: Move split cache topup functions to shadow_mmu.c
  KVM: x86/mmu: Move Shadow MMU part of kvm_mmu_zap_all() to
    shadow_mmu.h
  KVM: x86/mmu: Move Shadow MMU init/teardown to shadow_mmu.c
  KVM: x86/mmu: Split out Shadow MMU lockless walk begin/end

Sean Christopherson (3):
  KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up
    code
  KVM: x86/mmu: Replace comment with an actual lockdep assertion on
    mmu_lock
  KVM: x86/mmu: Clean up mmu.c functions that put return type on
    separate line

 arch/x86/kvm/Makefile           |    2 +-
 arch/x86/kvm/debugfs.c          |    1 +
 arch/x86/kvm/mmu/mmu.c          | 4834 ++++---------------------------
 arch/x86/kvm/mmu/mmu_internal.h |   87 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |   15 +-
 arch/x86/kvm/mmu/shadow_mmu.c   | 3692 +++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h   |  132 +
 7 files changed, 4498 insertions(+), 4265 deletions(-)
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.c
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.h


base-commit: 7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 02/21] KVM: x86/mmu: Replace comment with an actual lockdep assertion on mmu_lock Ben Gardon
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

From: Sean Christopherson <seanjc@google.com>

Replace "slot_handle_level" with "walk_slot_rmaps" to better capture what
the helpers are doing, and to slightly shorten the function names so that
each function's return type and attributes can be placed on the same line
as the function declaration.

No functional change intended.

Link: https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 66 +++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index aeb240b339f54..09a0a2cc76bae 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5801,23 +5801,24 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 EXPORT_SYMBOL_GPL(kvm_configure_mmu);
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm,
+typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
 				    struct kvm_rmap_head *rmap_head,
 				    const struct kvm_memory_slot *slot);
 
 /* The caller should hold mmu-lock before calling this function. */
-static __always_inline bool
-slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
-			slot_level_handler fn, int start_level, int end_level,
-			gfn_t start_gfn, gfn_t end_gfn, bool flush_on_yield,
-			bool flush)
+static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
+					      const struct kvm_memory_slot *slot,
+					      slot_rmaps_handler fn,
+					      int start_level, int end_level,
+					      gfn_t start_gfn, gfn_t end_gfn,
+					      bool flush_on_yield, bool flush)
 {
 	struct slot_rmap_walk_iterator iterator;
 
-	for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
+	for_each_slot_rmap_range(slot, start_level, end_level, start_gfn,
 			end_gfn, &iterator) {
 		if (iterator.rmap)
-			flush |= fn(kvm, iterator.rmap, memslot);
+			flush |= fn(kvm, iterator.rmap, slot);
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
 			if (flush && flush_on_yield) {
@@ -5833,23 +5834,23 @@ slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
 	return flush;
 }
 
-static __always_inline bool
-slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
-		  slot_level_handler fn, int start_level, int end_level,
-		  bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps(struct kvm *kvm,
+					    const struct kvm_memory_slot *slot,
+					    slot_rmaps_handler fn,
+					    int start_level, int end_level,
+					    bool flush_on_yield)
 {
-	return slot_handle_level_range(kvm, memslot, fn, start_level,
-			end_level, memslot->base_gfn,
-			memslot->base_gfn + memslot->npages - 1,
-			flush_on_yield, false);
+	return __walk_slot_rmaps(kvm, slot, fn, start_level, end_level,
+				 slot->base_gfn, slot->base_gfn + slot->npages - 1,
+				 flush_on_yield, false);
 }
 
-static __always_inline bool
-slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
-		     slot_level_handler fn, bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps_4k(struct kvm *kvm,
+					       const struct kvm_memory_slot *slot,
+					       slot_rmaps_handler fn,
+					       bool flush_on_yield)
 {
-	return slot_handle_level(kvm, memslot, fn, PG_LEVEL_4K,
-				 PG_LEVEL_4K, flush_on_yield);
+	return walk_slot_rmaps(kvm, slot, fn, PG_LEVEL_4K, PG_LEVEL_4K, flush_on_yield);
 }
 
 static void free_mmu_pages(struct kvm_mmu *mmu)
@@ -6144,9 +6145,9 @@ static bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_e
 			if (WARN_ON_ONCE(start >= end))
 				continue;
 
-			flush = slot_handle_level_range(kvm, memslot, __kvm_zap_rmap,
-							PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
-							start, end - 1, true, flush);
+			flush = __walk_slot_rmaps(kvm, memslot, __kvm_zap_rmap,
+						  PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
+						  start, end - 1, true, flush);
 		}
 	}
 
@@ -6199,8 +6200,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 {
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		slot_handle_level(kvm, memslot, slot_rmap_write_protect,
-				  start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+		walk_slot_rmaps(kvm, memslot, slot_rmap_write_protect,
+				start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
 		write_unlock(&kvm->mmu_lock);
 	}
 
@@ -6435,10 +6436,9 @@ static void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
 	 * all the way to the target level. There's no need to split pages
 	 * already at the target level.
 	 */
-	for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) {
-		slot_handle_level_range(kvm, slot, shadow_mmu_try_split_huge_pages,
-					level, level, start, end - 1, true, false);
-	}
+	for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--)
+		__walk_slot_rmaps(kvm, slot, shadow_mmu_try_split_huge_pages,
+				  level, level, start, end - 1, true, false);
 }
 
 /* Must be called with the mmu_lock held in write-mode. */
@@ -6537,8 +6537,8 @@ static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 	 * Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap
 	 * pages that are already mapped at the maximum hugepage level.
 	 */
-	if (slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
-			      PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, true))
+	if (walk_slot_rmaps(kvm, slot, kvm_mmu_zap_collapsible_spte,
+			    PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, true))
 		kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
 }
 
@@ -6582,7 +6582,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 		 * Clear dirty bits only on 4k SPTEs since the legacy MMU only
 		 * support dirty logging at a 4k granularity.
 		 */
-		slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
+		walk_slot_rmaps_4k(kvm, memslot, __rmap_clear_dirty, false);
 		write_unlock(&kvm->mmu_lock);
 	}
 
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 02/21] KVM: x86/mmu: Replace comment with an actual lockdep assertion on mmu_lock
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
  2023-02-02 18:27 ` [PATCH 01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 03/21] KVM: x86/mmu: Clean up mmu.c functions that put return type on separate line Ben Gardon
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

From: Sean Christopherson <seanjc@google.com>

Assert that mmu_lock is held for write in __walk_slot_rmaps() instead of
hoping the function comment will magically prevent introducing bugs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 09a0a2cc76bae..2ea8e58f83256 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5805,7 +5805,6 @@ typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
 				    struct kvm_rmap_head *rmap_head,
 				    const struct kvm_memory_slot *slot);
 
-/* The caller should hold mmu-lock before calling this function. */
 static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
 					      const struct kvm_memory_slot *slot,
 					      slot_rmaps_handler fn,
@@ -5815,6 +5814,8 @@ static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
 {
 	struct slot_rmap_walk_iterator iterator;
 
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
 	for_each_slot_rmap_range(slot, start_level, end_level, start_gfn,
 			end_gfn, &iterator) {
 		if (iterator.rmap)
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 03/21] KVM: x86/mmu: Clean up mmu.c functions that put return type on separate line
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
  2023-02-02 18:27 ` [PATCH 01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code Ben Gardon
  2023-02-02 18:27 ` [PATCH 02/21] KVM: x86/mmu: Replace comment with an actual lockdep assertion on mmu_lock Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 04/21] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

From: Sean Christopherson <seanjc@google.com>

Adjust a variety of functions in mmu.c to put the function return type on
the same line as the function declaration.  As stated in the Linus
specification:

  But the "on their own line" is complete garbage to begin with. That
  will NEVER be a kernel rule. We should never have a rule that assumes
  things are so long that they need to be on multiple lines.

  We don't put function return types on their own lines either, even if
  some other projects have that rule (just to get function names at the
  beginning of lines or some other odd reason).

Leave the functions generated by BUILD_MMU_ROLE_REGS_ACCESSOR() as-is,
that code is basically illegible no matter how it's formatted.

No functional change intended.

Link: https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 59 ++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2ea8e58f83256..3674bde2203b2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -876,9 +876,9 @@ static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 	untrack_possible_nx_huge_page(kvm, sp);
 }
 
-static struct kvm_memory_slot *
-gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
-			    bool no_dirty_log)
+static struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+							   gfn_t gfn,
+							   bool no_dirty_log)
 {
 	struct kvm_memory_slot *slot;
 
@@ -938,10 +938,9 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte,
 	return count;
 }
 
-static void
-pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head,
-			   struct pte_list_desc *desc, int i,
-			   struct pte_list_desc *prev_desc)
+static void pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head,
+				       struct pte_list_desc *desc, int i,
+				       struct pte_list_desc *prev_desc)
 {
 	int j = desc->spte_count - 1;
 
@@ -1493,8 +1492,8 @@ struct slot_rmap_walk_iterator {
 	struct kvm_rmap_head *end_rmap;
 };
 
-static void
-rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator, int level)
+static void rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator,
+				 int level)
 {
 	iterator->level = level;
 	iterator->gfn = iterator->start_gfn;
@@ -1502,10 +1501,10 @@ rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator, int level)
 	iterator->end_rmap = gfn_to_rmap(iterator->end_gfn, level, iterator->slot);
 }
 
-static void
-slot_rmap_walk_init(struct slot_rmap_walk_iterator *iterator,
-		    const struct kvm_memory_slot *slot, int start_level,
-		    int end_level, gfn_t start_gfn, gfn_t end_gfn)
+static void slot_rmap_walk_init(struct slot_rmap_walk_iterator *iterator,
+				const struct kvm_memory_slot *slot,
+				int start_level, int end_level,
+				gfn_t start_gfn, gfn_t end_gfn)
 {
 	iterator->slot = slot;
 	iterator->start_level = start_level;
@@ -3295,9 +3294,9 @@ static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
  * Returns true if the SPTE was fixed successfully. Otherwise,
  * someone else modified the SPTE from its original value.
  */
-static bool
-fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
-			u64 *sptep, u64 old_spte, u64 new_spte)
+static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu,
+				    struct kvm_page_fault *fault,
+				    u64 *sptep, u64 old_spte, u64 new_spte)
 {
 	/*
 	 * Theoretically we could also set dirty bit (and flush TLB) here in
@@ -4626,10 +4625,9 @@ static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
 #include "paging_tmpl.h"
 #undef PTTYPE
 
-static void
-__reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
-			u64 pa_bits_rsvd, int level, bool nx, bool gbpages,
-			bool pse, bool amd)
+static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
+				    u64 pa_bits_rsvd, int level, bool nx,
+				    bool gbpages, bool pse, bool amd)
 {
 	u64 gbpages_bit_rsvd = 0;
 	u64 nonleaf_bit8_rsvd = 0;
@@ -4742,9 +4740,9 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 				guest_cpuid_is_amd_or_hygon(vcpu));
 }
 
-static void
-__reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
-			    u64 pa_bits_rsvd, bool execonly, int huge_page_level)
+static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
+					u64 pa_bits_rsvd, bool execonly,
+					int huge_page_level)
 {
 	u64 high_bits_rsvd = pa_bits_rsvd & rsvd_bits(0, 51);
 	u64 large_1g_rsvd = 0, large_2m_rsvd = 0;
@@ -4844,8 +4842,7 @@ static inline bool boot_cpu_is_amd(void)
  * the direct page table on host, use as much mmu features as
  * possible, however, kvm currently does not do execution-protection.
  */
-static void
-reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
+static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 {
 	struct rsvd_bits_validate *shadow_zero_check;
 	int i;
@@ -5060,8 +5057,8 @@ static void paging32_init_context(struct kvm_mmu *context)
 	context->invlpg = paging32_invlpg;
 }
 
-static union kvm_cpu_role
-kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
+static union kvm_cpu_role kvm_calc_cpu_role(struct kvm_vcpu *vcpu,
+					    const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_cpu_role role = {0};
 
@@ -6653,8 +6650,8 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
 	}
 }
 
-static unsigned long
-mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+static unsigned long mmu_shrink_scan(struct shrinker *shrink,
+				     struct shrink_control *sc)
 {
 	struct kvm *kvm;
 	int nr_to_scan = sc->nr_to_scan;
@@ -6712,8 +6709,8 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	return freed;
 }
 
-static unsigned long
-mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+static unsigned long mmu_shrink_count(struct shrinker *shrink,
+				      struct shrink_control *sc)
 {
 	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
 }
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 04/21] KVM: x86/MMU: Add shadow_mmu.(c|h)
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (2 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 03/21] KVM: x86/mmu: Clean up mmu.c functions that put return type on separate line Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 05/21] KVM: x86/MMU: Expose functions for the Shadow MMU Ben Gardon
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

As a first step to splitting the Shadow MMU out of KVM MMU common code,
add separate files for it with some of the boilerplate and includes the
Shadow MMU will need.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/Makefile         |  2 +-
 arch/x86/kvm/mmu/mmu.c        |  1 +
 arch/x86/kvm/mmu/shadow_mmu.c | 23 +++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h | 21 +++++++++++++++++++++
 4 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.c
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.h

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 80e3fe184d17e..d6e94660b006e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,7 +12,7 @@ include $(srctree)/virt/kvm/Makefile.kvm
 kvm-y			+= x86.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
 			   hyperv.o debugfs.o mmu/mmu.o mmu/page_track.o \
-			   mmu/spte.o
+			   mmu/spte.o mmu/shadow_mmu.o
 
 ifdef CONFIG_HYPERV
 kvm-y			+= kvm_onhyperv.o
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3674bde2203b2..752c38d625a32 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -21,6 +21,7 @@
 #include "mmu.h"
 #include "mmu_internal.h"
 #include "tdp_mmu.h"
+#include "shadow_mmu.h"
 #include "x86.h"
 #include "kvm_cache_regs.h"
 #include "smm.h"
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
new file mode 100644
index 0000000000000..eee5a6796d9b0
--- /dev/null
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM Shadow MMU
+ *
+ * Extracted from mmu.c
+ *
+ * Copyright (C) 2006 Qumranet, Inc.
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (C) 2023, Google, Inc.
+ *
+ * Original authors:
+ *   Yaniv Kamay  <yaniv@qumranet.com>
+ *   Avi Kivity   <avi@qumranet.com>
+ */
+#include "mmu.h"
+#include "mmu_internal.h"
+#include "mmutrace.h"
+#include "shadow_mmu.h"
+#include "spte.h"
+
+#include <asm/vmx.h>
+#include <asm/cmpxchg.h>
+#include <trace/events/kvm.h>
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
new file mode 100644
index 0000000000000..2bfba6ad20688
--- /dev/null
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KVM Shadow MMU
+ *
+ * Extracted from mmu.c
+ *
+ * Copyright (C) 2006 Qumranet, Inc.
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (C) 2023, Google, Inc.
+ *
+ * Original authors:
+ *   Yaniv Kamay  <yaniv@qumranet.com>
+ *   Avi Kivity   <avi@qumranet.com>
+ */
+
+#ifndef __KVM_X86_MMU_SHADOW_MMU_H
+#define __KVM_X86_MMU_SHADOW_MMU_H
+
+#include <linux/kvm_host.h>
+
+#endif /* __KVM_X86_MMU_SHADOW_MMU_H */
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 05/21] KVM: x86/MMU: Expose functions for the Shadow MMU
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (3 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 04/21] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36() Ben Gardon
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Expose various common MMU functions which the Shadow MMU will need via
mmu_internal.h. This slightly reduces the work needed to move the
shadow MMU code out of mmu.c, which will already be a massive change.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 41 ++++++++++++++-------------------
 arch/x86/kvm/mmu/mmu_internal.h | 24 +++++++++++++++++++
 2 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 752c38d625a32..12d38a8772a80 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -164,9 +164,9 @@ struct kvm_shadow_walk_iterator {
 		({ spte = mmu_spte_get_lockless(_walker.sptep); 1; });	\
 	     __shadow_walk_next(&(_walker), spte))
 
-static struct kmem_cache *pte_list_desc_cache;
+struct kmem_cache *pte_list_desc_cache;
 struct kmem_cache *mmu_page_header_cache;
-static struct percpu_counter kvm_total_used_mmu_pages;
+struct percpu_counter kvm_total_used_mmu_pages;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
 
@@ -242,11 +242,6 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
 	return regs;
 }
 
-static inline bool kvm_available_flush_tlb_with_range(void)
-{
-	return kvm_x86_ops.tlb_remote_flush_with_range;
-}
-
 static void kvm_flush_remote_tlbs_with_range(struct kvm *kvm,
 		struct kvm_tlb_range *range)
 {
@@ -270,8 +265,8 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 	kvm_flush_remote_tlbs_with_range(kvm, &range);
 }
 
-static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
-			   unsigned int access)
+void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
+		    unsigned int access)
 {
 	u64 spte = make_mmio_spte(vcpu, gfn, access);
 
@@ -623,7 +618,7 @@ static inline bool is_tdp_mmu_active(struct kvm_vcpu *vcpu)
 	return tdp_mmu_enabled && vcpu->arch.mmu->root_role.direct;
 }
 
-static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
+void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 {
 	if (is_tdp_mmu_active(vcpu)) {
 		kvm_tdp_mmu_walk_lockless_begin();
@@ -642,7 +637,7 @@ static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 	}
 }
 
-static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
+void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 {
 	if (is_tdp_mmu_active(vcpu)) {
 		kvm_tdp_mmu_walk_lockless_end();
@@ -835,8 +830,8 @@ void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 		      &kvm->arch.possible_nx_huge_pages);
 }
 
-static void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-				 bool nx_huge_page_possible)
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible)
 {
 	sp->nx_huge_page_disallowed = true;
 
@@ -870,16 +865,15 @@ void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 	list_del_init(&sp->possible_nx_huge_page_link);
 }
 
-static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	sp->nx_huge_page_disallowed = false;
 
 	untrack_possible_nx_huge_page(kvm, sp);
 }
 
-static struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
-							   gfn_t gfn,
-							   bool no_dirty_log)
+struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+						    gfn_t gfn, bool no_dirty_log)
 {
 	struct kvm_memory_slot *slot;
 
@@ -1415,7 +1409,7 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 	return write_protected;
 }
 
-static bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn)
+bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn)
 {
 	struct kvm_memory_slot *slot;
 
@@ -1914,9 +1908,8 @@ static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	return ret;
 }
 
-static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
-					struct list_head *invalid_list,
-					bool remote_flush)
+bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list,
+				 bool remote_flush)
 {
 	if (!remote_flush && list_empty(invalid_list))
 		return false;
@@ -1928,7 +1921,7 @@ static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
 	return true;
 }
 
-static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
+bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	if (sp->role.invalid)
 		return true;
@@ -6216,7 +6209,7 @@ static inline bool need_topup(struct kvm_mmu_memory_cache *cache, int min)
 	return kvm_mmu_memory_cache_nr_free_objects(cache) < min;
 }
 
-static bool need_topup_split_caches_or_resched(struct kvm *kvm)
+bool need_topup_split_caches_or_resched(struct kvm *kvm)
 {
 	if (need_resched() || rwlock_needbreak(&kvm->mmu_lock))
 		return true;
@@ -6231,7 +6224,7 @@ static bool need_topup_split_caches_or_resched(struct kvm *kvm)
 	       need_topup(&kvm->arch.split_shadow_page_cache, 1);
 }
 
-static int topup_split_caches(struct kvm *kvm)
+int topup_split_caches(struct kvm *kvm)
 {
 	/*
 	 * Allocating rmap list entries when splitting huge pages for nested
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index ac00bfbf32f67..95f0adfb3b0b4 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -131,7 +131,9 @@ struct kvm_mmu_page {
 #endif
 };
 
+extern struct kmem_cache *pte_list_desc_cache;
 extern struct kmem_cache *mmu_page_header_cache;
+extern struct percpu_counter kvm_total_used_mmu_pages;
 
 static inline int kvm_mmu_role_as_id(union kvm_mmu_page_role role)
 {
@@ -323,6 +325,28 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible);
 void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
+static inline bool kvm_available_flush_tlb_with_range(void)
+{
+	return kvm_x86_ops.tlb_remote_flush_with_range;
+}
+
+void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
+		    unsigned int access);
+struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+						    gfn_t gfn, bool no_dirty_log);
+bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn);
+bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list,
+				 bool remote_flush);
+bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
+
+void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu);
+void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu);
+
+bool need_topup_split_caches_or_resched(struct kvm *kvm);
+int topup_split_caches(struct kvm *kvm);
 #endif /* __KVM_X86_MMU_INTERNAL_H */
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36()
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (4 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 05/21] KVM: x86/MMU: Expose functions for the Shadow MMU Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-03-20 18:51   ` Sean Christopherson
  2023-02-02 18:27 ` [PATCH 08/21] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

is_cpuid_PSE36() always returns 1 and is never overridden, so just get
rid of the function. This saves having to export it in a future commit
in order to move the include of paging_tmpl.h out of mmu.c.

No functional change intended.

Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c         | 13 ++-----------
 arch/x86/kvm/mmu/paging_tmpl.h |  2 +-
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 12d38a8772a80..35cb59737c0a3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -304,11 +304,6 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte)
 	return likely(kvm_gen == spte_gen);
 }
 
-static int is_cpuid_PSE36(void)
-{
-	return 1;
-}
-
 #ifdef CONFIG_X86_64
 static void __set_spte(u64 *sptep, u64 spte)
 {
@@ -4661,12 +4656,8 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 			break;
 		}
 
-		if (is_cpuid_PSE36())
-			/* 36bits PSE 4MB page */
-			rsvd_check->rsvd_bits_mask[1][1] = rsvd_bits(17, 21);
-		else
-			/* 32 bits PSE 4MB page */
-			rsvd_check->rsvd_bits_mask[1][1] = rsvd_bits(13, 21);
+		/* 36bits PSE 4MB page */
+		rsvd_check->rsvd_bits_mask[1][1] = rsvd_bits(17, 21);
 		break;
 	case PT32E_ROOT_LEVEL:
 		rsvd_check->rsvd_bits_mask[0][2] = rsvd_bits(63, 63) |
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index e5662dbd519c4..730b413eebfde 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -426,7 +426,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	gfn += (addr & PT_LVL_OFFSET_MASK(walker->level)) >> PAGE_SHIFT;
 
 #if PTTYPE == 32
-	if (walker->level > PG_LEVEL_4K && is_cpuid_PSE36())
+	if (walker->level > PG_LEVEL_4K)
 		gfn += pse36_gfn_delta(pte);
 #endif
 
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 08/21] KVM: x86/MMU: Expose functions for paging_tmpl.h
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (5 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36() Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

In preparation for moving paging_tmpl.h to shadow_mmu.c, expose various
functions it needs through mmu_internal.h. This includes moving all the
BUILD_MMU_ROLE_*() macros. Not all of those macros are strictly needed
by paging_tmpl.h, but it is cleaner to keep them together.
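
For reference, each invocation of BUILD_MMU_ROLE_REGS_ACCESSOR() just
generates a small inline helper; e.g. BUILD_MMU_ROLE_REGS_ACCESSOR(cr0,
wp, X86_CR0_WP) expands to roughly:

static inline bool __maybe_unused
____is_cr0_wp(const struct kvm_mmu_role_regs *regs)
{
	return !!(regs->cr0 & X86_CR0_WP);
}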

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 68 +++++----------------------------
 arch/x86/kvm/mmu/mmu_internal.h | 59 ++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 59 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2162dfda9601f..da290bfca0137 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -123,57 +123,9 @@ struct kmem_cache *pte_list_desc_cache;
 struct kmem_cache *mmu_page_header_cache;
 struct percpu_counter kvm_total_used_mmu_pages;
 
-struct kvm_mmu_role_regs {
-	const unsigned long cr0;
-	const unsigned long cr4;
-	const u64 efer;
-};
-
 #define CREATE_TRACE_POINTS
 #include "mmutrace.h"
 
-/*
- * Yes, lot's of underscores.  They're a hint that you probably shouldn't be
- * reading from the role_regs.  Once the root_role is constructed, it becomes
- * the single source of truth for the MMU's state.
- */
-#define BUILD_MMU_ROLE_REGS_ACCESSOR(reg, name, flag)			\
-static inline bool __maybe_unused					\
-____is_##reg##_##name(const struct kvm_mmu_role_regs *regs)		\
-{									\
-	return !!(regs->reg & flag);					\
-}
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr0, pg, X86_CR0_PG);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr0, wp, X86_CR0_WP);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pse, X86_CR4_PSE);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pae, X86_CR4_PAE);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, smep, X86_CR4_SMEP);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, smap, X86_CR4_SMAP);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pke, X86_CR4_PKE);
-BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, la57, X86_CR4_LA57);
-BUILD_MMU_ROLE_REGS_ACCESSOR(efer, nx, EFER_NX);
-BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
-
-/*
- * The MMU itself (with a valid role) is the single source of truth for the
- * MMU.  Do not use the regs used to build the MMU/role, nor the vCPU.  The
- * regs don't account for dependencies, e.g. clearing CR4 bits if CR0.PG=1,
- * and the vCPU may be incorrect/irrelevant.
- */
-#define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
-static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
-{								\
-	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
-}
-BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pse);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smep);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smap);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pke);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, la57);
-BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
-BUILD_MMU_ROLE_ACCESSOR(ext,  efer, lma);
-
 static inline bool is_cr0_pg(struct kvm_mmu *mmu)
 {
         return mmu->cpu_role.base.level > 0;
@@ -218,7 +170,7 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 	kvm_flush_remote_tlbs_with_range(kvm, &range);
 }
 
-static gfn_t get_mmio_spte_gfn(u64 spte)
+gfn_t get_mmio_spte_gfn(u64 spte)
 {
 	u64 gpa = spte & shadow_nonpresent_or_rsvd_lower_gfn_mask;
 
@@ -287,7 +239,7 @@ void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 	}
 }
 
-static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
+int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
 {
 	int r;
 
@@ -828,9 +780,8 @@ static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fa
 	return -EFAULT;
 }
 
-static int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu,
-				   struct kvm_page_fault *fault,
-				   unsigned int access)
+int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+			    unsigned int access)
 {
 	gva_t gva = fault->is_tdp ? 0 : fault->addr;
 
@@ -1284,8 +1235,8 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	return RET_PF_RETRY;
 }
 
-static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
-					 struct kvm_page_fault *fault)
+bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
+				  struct kvm_page_fault *fault)
 {
 	if (unlikely(fault->rsvd))
 		return false;
@@ -1408,8 +1359,8 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	return RET_PF_CONTINUE;
 }
 
-static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
-			   unsigned int access)
+int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+		    unsigned int access)
 {
 	int ret;
 
@@ -1433,8 +1384,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
  * Returns true if the page fault is stale and needs to be retried, i.e. if the
  * root was invalidated by a memslot update or a relevant mmu_notifier fired.
  */
-static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
-				struct kvm_page_fault *fault)
+bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 9c1399762496b..349d4a300ad34 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -347,6 +347,65 @@ bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
 void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu);
 void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu);
 
+int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect);
 bool need_topup_split_caches_or_resched(struct kvm *kvm);
 int topup_split_caches(struct kvm *kvm);
+
+bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
+				  struct kvm_page_fault *fault);
+int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+		    unsigned int access);
+int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+			    unsigned int access);
+
+gfn_t get_mmio_spte_gfn(u64 spte);
+
+struct kvm_mmu_role_regs {
+	const unsigned long cr0;
+	const unsigned long cr4;
+	const u64 efer;
+};
+
+/*
+ * Yes, lot's of underscores.  They're a hint that you probably shouldn't be
+ * reading from the role_regs.  Once the root_role is constructed, it becomes
+ * the single source of truth for the MMU's state.
+ */
+#define BUILD_MMU_ROLE_REGS_ACCESSOR(reg, name, flag)			\
+static inline bool __maybe_unused					\
+____is_##reg##_##name(const struct kvm_mmu_role_regs *regs)		\
+{									\
+	return !!(regs->reg & flag);					\
+}
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr0, pg, X86_CR0_PG);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr0, wp, X86_CR0_WP);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pse, X86_CR4_PSE);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pae, X86_CR4_PAE);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, smep, X86_CR4_SMEP);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, smap, X86_CR4_SMAP);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, pke, X86_CR4_PKE);
+BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, la57, X86_CR4_LA57);
+BUILD_MMU_ROLE_REGS_ACCESSOR(efer, nx, EFER_NX);
+BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
+
+/*
+ * The MMU itself (with a valid role) is the single source of truth for the
+ * MMU.  Do not use the regs used to build the MMU/role, nor the vCPU.  The
+ * regs don't account for dependencies, e.g. clearing CR4 bits if CR0.PG=1,
+ * and the vCPU may be incorrect/irrelevant.
+ */
+#define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
+static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
+{								\
+	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
+}
+BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
+BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pse);
+BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smep);
+BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smap);
+BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pke);
+BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, la57);
+BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
+BUILD_MMU_ROLE_ACCESSOR(ext,  efer, lma);
 #endif /* __KVM_X86_MMU_INTERNAL_H */
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (6 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 08/21] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-03-20 18:41   ` Sean Christopherson
  2023-02-02 18:27 ` [PATCH 10/21] KVM: x86/MMU: Clean up Shadow MMU exports Ben Gardon
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Move the integration point for paging_tmpl.h to shadow_mmu.c since
paging_tmpl.h is ostensibly part of the Shadow MMU. This requires
modifying some of the definitions to be non-static and then exporting
the pre-processed function names through shadow_mmu.h since they are
needed for mmu context callbacks in mmu.c. This will facilitate cleanups
in following commits because many of the functions being exposed by
shadow_mmu.h are only needed by paging_tmpl.h. Those functions will no
longer need to be exported.
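
For context, paging_tmpl.h is compiled three times with different PTTYPE
values, and its FNAME() macro prefixes each function accordingly
(simplified here; the real header defines several other per-type macros
alongside FNAME()):

#if PTTYPE == 64
#define FNAME(name) paging##64_##name
#elif PTTYPE == 32
#define FNAME(name) paging##32_##name
#elif PTTYPE == PTTYPE_EPT
#define FNAME(name) ept_##name
#endif

FNAME(page_fault) therefore becomes paging64_page_fault(),
paging32_page_fault(), or ept_page_fault() depending on the
instantiation, and those pre-processed names are what shadow_mmu.h now
declares for the mmu context callbacks in mmu.c.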

sync_mmio_spte() is only used by paging_tmpl.h, so move it along with
the includes.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c         | 29 -----------------------------
 arch/x86/kvm/mmu/paging_tmpl.h | 11 +++++------
 arch/x86/kvm/mmu/shadow_mmu.c  | 31 +++++++++++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h  | 25 ++++++++++++++++++++++++-
 4 files changed, 60 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index da290bfca0137..cef481a17a519 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1697,35 +1697,6 @@ static unsigned long get_cr3(struct kvm_vcpu *vcpu)
 	return kvm_read_cr3(vcpu);
 }
 
-static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
-			   unsigned int access)
-{
-	if (unlikely(is_mmio_spte(*sptep))) {
-		if (gfn != get_mmio_spte_gfn(*sptep)) {
-			mmu_spte_clear_no_track(sptep);
-			return true;
-		}
-
-		mark_mmio_spte(vcpu, sptep, gfn, access);
-		return true;
-	}
-
-	return false;
-}
-
-#define PTTYPE_EPT 18 /* arbitrary */
-#define PTTYPE PTTYPE_EPT
-#include "paging_tmpl.h"
-#undef PTTYPE
-
-#define PTTYPE 64
-#include "paging_tmpl.h"
-#undef PTTYPE
-
-#define PTTYPE 32
-#include "paging_tmpl.h"
-#undef PTTYPE
-
 static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 				    u64 pa_bits_rsvd, int level, bool nx,
 				    bool gbpages, bool pse, bool amd)
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 730b413eebfde..1251357794538 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -787,7 +787,7 @@ FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
  *  Returns: 1 if we need to emulate the instruction, 0 otherwise, or
  *           a negative value on error.
  */
-static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct guest_walker walker;
 	int r;
@@ -889,7 +889,7 @@ static gpa_t FNAME(get_level1_sp_gpa)(struct kvm_mmu_page *sp)
 	return gfn_to_gpa(sp->gfn) + offset * sizeof(pt_element_t);
 }
 
-static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
+void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -949,9 +949,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
 }
 
 /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GPA. */
-static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
-			       gpa_t addr, u64 access,
-			       struct x86_exception *exception)
+gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gpa_t addr,
+			u64 access, struct x86_exception *exception)
 {
 	struct guest_walker walker;
 	gpa_t gpa = INVALID_GPA;
@@ -984,7 +983,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
  *   0: the sp is synced and no tlb flushing is required
  * > 0: the sp is synced and tlb flushing is required
  */
-static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
 	union kvm_mmu_page_role root_role = vcpu->arch.mmu->root_role;
 	int i;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index f3e2ed5b675eb..c7cfdc6f51b53 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -12,6 +12,8 @@
  *   Yaniv Kamay  <yaniv@qumranet.com>
  *   Avi Kivity   <avi@qumranet.com>
  */
+
+#include "ioapic.h"
 #include "mmu.h"
 #include "mmu_internal.h"
 #include "mmutrace.h"
@@ -2809,6 +2811,35 @@ void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
 	walk_shadow_page_lockless_end(vcpu);
 }
 
+static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
+			   unsigned int access)
+{
+	if (unlikely(is_mmio_spte(*sptep))) {
+		if (gfn != get_mmio_spte_gfn(*sptep)) {
+			mmu_spte_clear_no_track(sptep);
+			return true;
+		}
+
+		mark_mmio_spte(vcpu, sptep, gfn, access);
+		return true;
+	}
+
+	return false;
+}
+
+#define PTTYPE_EPT 18 /* arbitrary */
+#define PTTYPE PTTYPE_EPT
+#include "paging_tmpl.h"
+#undef PTTYPE
+
+#define PTTYPE 64
+#include "paging_tmpl.h"
+#undef PTTYPE
+
+#define PTTYPE 32
+#include "paging_tmpl.h"
+#undef PTTYPE
+
 static bool is_obsolete_root(struct kvm *kvm, hpa_t root_hpa)
 {
 	struct kvm_mmu_page *sp;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 4534eadc9a17c..7faf8b06e68f1 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -86,7 +86,6 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 		       int level, pte_t unused);
 
 void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte);
-int nonpaging_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
 int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
 		      bool can_yield);
 void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
@@ -163,4 +162,28 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 				    const struct kvm_memory_slot *slot);
 
 unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
+
+/* Exports from paging_tmpl.h */
+gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+			  gpa_t vaddr, u64 access,
+			  struct x86_exception *exception);
+gpa_t paging64_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+			  gpa_t vaddr, u64 access,
+			  struct x86_exception *exception);
+gpa_t ept_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gpa_t vaddr,
+		     u64 access, struct x86_exception *exception);
+
+int paging32_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int paging64_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int ept_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+
+int paging32_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+int paging64_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+int ept_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+/* Defined in shadow_mmu.c. */
+int nonpaging_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+
+void paging32_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
+void paging64_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
+void ept_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
 #endif /* __KVM_X86_MMU_SHADOW_MMU_H */
-- 
2.39.1.519.gcb327c4b5f-goog



* [PATCH 10/21] KVM: x86/MMU: Clean up Shadow MMU exports
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (7 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-02 18:27 ` [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Now that paging_tmpl.h is included from shadow_mmu.c, there's no need to
export many of the functions currently in shadow_mmu.h, so remove those
exports and mark the functions static. This cleans up the interface
of the Shadow MMU, and will allow the implementation to keep the details
of rmap_heads internal.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/shadow_mmu.c | 78 +++++++++++++++++++++--------------
 arch/x86/kvm/mmu/shadow_mmu.h | 51 +----------------------
 2 files changed, 48 insertions(+), 81 deletions(-)

diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index c7cfdc6f51b53..1be680bce15a6 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -24,6 +24,20 @@
 #include <asm/cmpxchg.h>
 #include <trace/events/kvm.h>
 
+struct kvm_shadow_walk_iterator {
+	u64 addr;
+	hpa_t shadow_addr;
+	u64 *sptep;
+	int level;
+	unsigned index;
+};
+
+#define for_each_shadow_entry_using_root(_vcpu, _root, _addr, _walker)     \
+	for (shadow_walk_init_using_root(&(_walker), (_vcpu),              \
+					 (_root), (_addr));                \
+	     shadow_walk_okay(&(_walker));			           \
+	     shadow_walk_next(&(_walker)))
+
 #define for_each_shadow_entry(_vcpu, _addr, _walker)            \
 	for (shadow_walk_init(&(_walker), _vcpu, _addr);	\
 	     shadow_walk_okay(&(_walker));			\
@@ -230,7 +244,7 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
  *
  * Returns true if the TLB needs to be flushed
  */
-bool mmu_spte_update(u64 *sptep, u64 new_spte)
+static bool mmu_spte_update(u64 *sptep, u64 new_spte)
 {
 	bool flush = false;
 	u64 old_spte = mmu_spte_update_no_track(sptep, new_spte);
@@ -314,7 +328,7 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
  * Directly clear spte without caring the state bits of sptep,
  * it is used to set the upper level spte.
  */
-void mmu_spte_clear_no_track(u64 *sptep)
+static void mmu_spte_clear_no_track(u64 *sptep)
 {
 	__update_clear_spte_fast(sptep, 0ull);
 }
@@ -357,7 +371,7 @@ static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc)
 
 static bool sp_has_gptes(struct kvm_mmu_page *sp);
 
-gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
+static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
 {
 	if (sp->role.passthrough)
 		return sp->gfn;
@@ -413,8 +427,8 @@ static void kvm_mmu_page_set_translation(struct kvm_mmu_page *sp, int index,
 	          sp->gfn, kvm_mmu_page_get_gfn(sp, index), gfn);
 }
 
-void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
-			     unsigned int access)
+static void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
+				    unsigned int access)
 {
 	gfn_t gfn = kvm_mmu_page_get_gfn(sp, index);
 
@@ -629,7 +643,7 @@ struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
 	return &slot->arch.rmap[level - PG_LEVEL_4K][idx];
 }
 
-bool rmap_can_add(struct kvm_vcpu *vcpu)
+static bool rmap_can_add(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_memory_cache *mc;
 
@@ -737,7 +751,7 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
 	for (_spte_ = rmap_get_first(_rmap_head_, _iter_);		\
 	     _spte_; _spte_ = rmap_get_next(_iter_))
 
-void drop_spte(struct kvm *kvm, u64 *sptep)
+static void drop_spte(struct kvm *kvm, u64 *sptep)
 {
 	u64 old_spte = mmu_spte_clear_track_bits(kvm, sptep);
 
@@ -1114,7 +1128,7 @@ static void mmu_page_remove_parent_pte(struct kvm_mmu_page *sp,
 	pte_list_remove(parent_pte, &sp->parent_ptes);
 }
 
-void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte)
+static void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte)
 {
 	mmu_page_remove_parent_pte(sp, parent_pte);
 	mmu_spte_clear_no_track(parent_pte);
@@ -1344,8 +1358,8 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
 	} while (!sp->unsync_children);
 }
 
-int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
-		      bool can_yield)
+static int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
+			     bool can_yield)
 {
 	int i;
 	struct kvm_mmu_page *sp;
@@ -1391,7 +1405,7 @@ void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
 	atomic_set(&sp->write_flooding_count,  0);
 }
 
-void clear_sp_write_flooding_count(u64 *spte)
+static void clear_sp_write_flooding_count(u64 *spte)
 {
 	__clear_sp_write_flooding_count(sptep_to_sp(spte));
 }
@@ -1604,9 +1618,9 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct,
 	return role;
 }
 
-struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
-					  gfn_t gfn, bool direct,
-					  unsigned int access)
+static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu,
+						 u64 *sptep, gfn_t gfn,
+						 bool direct, unsigned int access)
 {
 	union kvm_mmu_page_role role;
 
@@ -1617,8 +1631,9 @@ struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
 	return kvm_mmu_get_shadow_page(vcpu, gfn, role);
 }
 
-void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
-				 struct kvm_vcpu *vcpu, hpa_t root, u64 addr)
+static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
+					struct kvm_vcpu *vcpu, hpa_t root,
+					u64 addr)
 {
 	iterator->addr = addr;
 	iterator->shadow_addr = root;
@@ -1645,14 +1660,14 @@ void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
 	}
 }
 
-void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
-		      struct kvm_vcpu *vcpu, u64 addr)
+static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
+			     struct kvm_vcpu *vcpu, u64 addr)
 {
 	shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root.hpa,
 				    addr);
 }
 
-bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator)
+static bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator)
 {
 	if (iterator->level < PG_LEVEL_4K)
 		return false;
@@ -1674,7 +1689,7 @@ static void __shadow_walk_next(struct kvm_shadow_walk_iterator *iterator,
 	--iterator->level;
 }
 
-void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
+static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
 {
 	__shadow_walk_next(iterator, *iterator->sptep);
 }
@@ -1714,13 +1729,14 @@ static void __link_shadow_page(struct kvm *kvm,
 		mark_unsync(sptep);
 }
 
-void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp)
+static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep,
+			     struct kvm_mmu_page *sp)
 {
 	__link_shadow_page(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, sptep, sp, true);
 }
 
-void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			  unsigned direct_access)
+static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
+				 unsigned direct_access)
 {
 	if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) {
 		struct kvm_mmu_page *child;
@@ -1742,8 +1758,8 @@ void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 }
 
 /* Returns the number of zapped non-leaf child shadow pages. */
-int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
-		     struct list_head *invalid_list)
+static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
+			    struct list_head *invalid_list)
 {
 	u64 pte;
 	struct kvm_mmu_page *child;
@@ -2156,9 +2172,9 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
 	return 0;
 }
 
-int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
-		 u64 *sptep, unsigned int pte_access, gfn_t gfn,
-		 kvm_pfn_t pfn, struct kvm_page_fault *fault)
+static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+			u64 *sptep, unsigned int pte_access, gfn_t gfn,
+			kvm_pfn_t pfn, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
 	int level = sp->role.level;
@@ -2263,8 +2279,8 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-void __direct_pte_prefetch(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
-			   u64 *sptep)
+static void __direct_pte_prefetch(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu_page *sp, u64 *sptep)
 {
 	u64 *spte, *start = NULL;
 	int i;
@@ -2800,7 +2816,7 @@ int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
 	return leaf;
 }
 
-void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
+static void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	u64 spte;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 7faf8b06e68f1..9f16c4782bfbf 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -36,32 +36,11 @@ struct pte_list_desc {
 	u64 *sptes[PTE_LIST_EXT];
 };
 
+/* Only exported for debugfs.c. */
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
 
-struct kvm_shadow_walk_iterator {
-	u64 addr;
-	hpa_t shadow_addr;
-	u64 *sptep;
-	int level;
-	unsigned index;
-};
-
-#define for_each_shadow_entry_using_root(_vcpu, _root, _addr, _walker)     \
-	for (shadow_walk_init_using_root(&(_walker), (_vcpu),              \
-					 (_root), (_addr));                \
-	     shadow_walk_okay(&(_walker));			           \
-	     shadow_walk_next(&(_walker)))
-
-bool mmu_spte_update(u64 *sptep, u64 new_spte);
-void mmu_spte_clear_no_track(u64 *sptep);
-gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index);
-void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
-			     unsigned int access);
-
 struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
 				  const struct kvm_memory_slot *slot);
-bool rmap_can_add(struct kvm_vcpu *vcpu);
-void drop_spte(struct kvm *kvm, u64 *sptep);
 bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect);
 bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			const struct kvm_memory_slot *slot);
@@ -85,30 +64,8 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 		       struct kvm_memory_slot *slot, gfn_t gfn,
 		       int level, pte_t unused);
 
-void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte);
-int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
-		      bool can_yield);
 void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
-void clear_sp_write_flooding_count(u64 *spte);
-
-struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
-					  gfn_t gfn, bool direct,
-					  unsigned int access);
-
-void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
-				 struct kvm_vcpu *vcpu, hpa_t root, u64 addr);
-void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
-		      struct kvm_vcpu *vcpu, u64 addr);
-bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator);
-void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator);
-
-void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp);
-
-void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-			  unsigned direct_access);
 
-int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
-		     struct list_head *invalid_list);
 bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 				struct list_head *invalid_list,
 				int *nr_zapped);
@@ -120,11 +77,6 @@ int make_mmu_pages_available(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 
-int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
-		 u64 *sptep, unsigned int pte_access, gfn_t gfn,
-		 kvm_pfn_t pfn, struct kvm_page_fault *fault);
-void __direct_pte_prefetch(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
-			   u64 *sptep);
 int direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
 
@@ -134,7 +86,6 @@ int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
 
 int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level);
 
-void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr);
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 		       int bytes, struct kvm_page_track_notifier_node *node);
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (8 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 10/21] KVM: x86/MMU: Clean up Shadow MMU exports Ben Gardon
@ 2023-02-02 18:27 ` Ben Gardon
  2023-02-04 13:52   ` kernel test robot
  2023-02-04 19:10   ` kernel test robot
  2023-02-02 18:28 ` [PATCH 12/21] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions Ben Gardon
                   ` (11 subsequent siblings)
  21 siblings, 2 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:27 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

The MMU shrinker currently only operates on the Shadow MMU, but having
the entire implementation in shadow_mmu.c is awkward since much of the
functionality isn't Shadow MMU specific. There has also been talk of
changing the target of the shrinker to the MMU caches rather than
already-allocated page tables. As a result, it makes sense to move some
of the implementation back to mmu.c.
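
With this split, mmu.c keeps the generic shrinker glue (taking kvm_lock,
walking vm_list, rotating the scanned VM to the tail) and calls into the
Shadow MMU for the actual zapping, roughly (condensed from the diff
below):

	/* mmu.c */
	freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);

	/* shadow_mmu.c: takes srcu and mmu_lock, then zaps pages */
	unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);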

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 43 ++++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.c | 62 ++++++++---------------------------
 arch/x86/kvm/mmu/shadow_mmu.h |  3 +-
 3 files changed, 58 insertions(+), 50 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cef481a17a519..3ea54b08239aa 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3145,6 +3145,49 @@ static unsigned long mmu_shrink_count(struct shrinker *shrink,
 	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
 }
 
+unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct kvm *kvm;
+	int nr_to_scan = sc->nr_to_scan;
+	unsigned long freed = 0;
+
+	mutex_lock(&kvm_lock);
+
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		/*
+		 * Never scan more than sc->nr_to_scan VM instances.
+		 * Will not hit this condition practically since we do not try
+		 * to shrink more than one VM and it is very unlikely to see
+		 * !n_used_mmu_pages so many times.
+		 */
+		if (!nr_to_scan--)
+			break;
+
+		/*
+		 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
+		 * here. We may skip a VM instance errorneosly, but we do not
+		 * want to shrink a VM that only started to populate its MMU
+		 * anyway.
+		 */
+		if (!kvm->arch.n_used_mmu_pages &&
+		    !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
+			continue;
+
+		freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
+
+		/*
+		 * unfair on small ones
+		 * per-vm shrinkers cry out
+		 * sadness comes quickly
+		 */
+		list_move_tail(&kvm->vm_list, &vm_list);
+		break;
+	}
+
+	mutex_unlock(&kvm_lock);
+	return freed;
+}
+
 static struct shrinker mmu_shrinker = {
 	.count_objects = mmu_shrink_count,
 	.scan_objects = mmu_shrink_scan,
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 1be680bce15a6..76c50aca3c487 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3160,7 +3160,7 @@ void kvm_zap_obsolete_pages(struct kvm *kvm)
 	kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
 }
 
-static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
+bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
 {
 	return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
 }
@@ -3429,60 +3429,24 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 		kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
 }
 
-unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
 {
-	struct kvm *kvm;
-	int nr_to_scan = sc->nr_to_scan;
 	unsigned long freed = 0;
+	int idx;
 
-	mutex_lock(&kvm_lock);
-
-	list_for_each_entry(kvm, &vm_list, vm_list) {
-		int idx;
-		LIST_HEAD(invalid_list);
-
-		/*
-		 * Never scan more than sc->nr_to_scan VM instances.
-		 * Will not hit this condition practically since we do not try
-		 * to shrink more than one VM and it is very unlikely to see
-		 * !n_used_mmu_pages so many times.
-		 */
-		if (!nr_to_scan--)
-			break;
-		/*
-		 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
-		 * here. We may skip a VM instance errorneosly, but we do not
-		 * want to shrink a VM that only started to populate its MMU
-		 * anyway.
-		 */
-		if (!kvm->arch.n_used_mmu_pages &&
-		    !kvm_has_zapped_obsolete_pages(kvm))
-			continue;
-
-		idx = srcu_read_lock(&kvm->srcu);
-		write_lock(&kvm->mmu_lock);
-
-		if (kvm_has_zapped_obsolete_pages(kvm)) {
-			kvm_mmu_commit_zap_page(kvm,
-			      &kvm->arch.zapped_obsolete_pages);
-			goto unlock;
-		}
+	idx = srcu_read_lock(&kvm->srcu);
+	write_lock(&kvm->mmu_lock);
 
-		freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan);
+	if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
+		kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+		goto out;
+	}
 
-unlock:
-		write_unlock(&kvm->mmu_lock);
-		srcu_read_unlock(&kvm->srcu, idx);
+	freed = kvm_mmu_zap_oldest_mmu_pages(kvm, pages_to_free);
 
-		/*
-		 * unfair on small ones
-		 * per-vm shrinkers cry out
-		 * sadness comes quickly
-		 */
-		list_move_tail(&kvm->vm_list, &vm_list);
-		break;
-	}
+out:
+	write_unlock(&kvm->mmu_lock);
+	srcu_read_unlock(&kvm->srcu, idx);
 
-	mutex_unlock(&kvm_lock);
 	return freed;
 }
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 9f16c4782bfbf..9e27d03fbe368 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -112,7 +112,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
 void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 				    const struct kvm_memory_slot *slot);
 
-unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
+bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
+unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
 
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/21] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (9 preceding siblings ...)
  2023-02-02 18:27 ` [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 13/21] KVM: x86/MMU: Fix naming on prepare / commit zap page functions Ben Gardon
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Change the naming scheme of several functions exported from the Shadow
MMU to match the naming scheme used by the TDP MMU: a kvm_shadow_mmu_
prefix. More cleanups will follow to convert the remaining functions to
a similar naming scheme, but for now, start with the trivial renames.
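
For example (from the diff below):

	make_mmu_pages_available()       -> kvm_shadow_mmu_make_pages_available()
	kvm_zap_obsolete_pages()         -> kvm_shadow_mmu_zap_obsolete_pages()
	kvm_rmap_zap_collapsible_sptes() -> kvm_shadow_mmu_zap_collapsible_sptes()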

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c         | 19 ++++++++++---------
 arch/x86/kvm/mmu/paging_tmpl.h |  2 +-
 arch/x86/kvm/mmu/shadow_mmu.c  | 19 ++++++++++---------
 arch/x86/kvm/mmu/shadow_mmu.h  | 17 +++++++++--------
 4 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3ea54b08239aa..9308ab8102f9b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1089,7 +1089,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 	int r;
 
 	write_lock(&vcpu->kvm->mmu_lock);
-	r = make_mmu_pages_available(vcpu);
+	r = kvm_shadow_mmu_make_pages_available(vcpu);
 	if (r < 0)
 		goto out_unlock;
 
@@ -1164,7 +1164,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	if (is_tdp_mmu_active(vcpu))
 		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
 	else
-		leaf = get_walk(vcpu, addr, sptes, &root);
+		leaf = kvm_shadow_mmu_get_walk(vcpu, addr, sptes, &root);
 
 	walk_shadow_page_lockless_end(vcpu);
 
@@ -1432,11 +1432,11 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (is_page_fault_stale(vcpu, fault))
 		goto out_unlock;
 
-	r = make_mmu_pages_available(vcpu);
+	r = kvm_shadow_mmu_make_pages_available(vcpu);
 	if (r)
 		goto out_unlock;
 
-	r = direct_map(vcpu, fault);
+	r = kvm_shadow_mmu_direct_map(vcpu, fault);
 
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
@@ -1471,7 +1471,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 		trace_kvm_page_fault(vcpu, fault_address, error_code);
 
 		if (kvm_event_needs_reinjection(vcpu))
-			kvm_mmu_unprotect_page_virt(vcpu, fault_address);
+			kvm_shadow_mmu_unprotect_page_virt(vcpu, fault_address);
 		r = kvm_mmu_page_fault(vcpu, fault_address, error_code, insn,
 				insn_len);
 	} else if (flags & KVM_PV_REASON_PAGE_NOT_PRESENT) {
@@ -2786,7 +2786,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 * In order to ensure all vCPUs drop their soon-to-be invalid roots,
 	 * invalidating TDP MMU roots must be done while holding mmu_lock for
 	 * write and in the same critical section as making the reload request,
-	 * e.g. before kvm_zap_obsolete_pages() could drop mmu_lock and yield.
+	 * e.g. before kvm_shadow_mmu_zap_obsolete_pages() could drop mmu_lock
+	 * and yield.
 	 */
 	if (tdp_mmu_enabled)
 		kvm_tdp_mmu_invalidate_all_roots(kvm);
@@ -2801,7 +2802,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 */
 	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
 
-	kvm_zap_obsolete_pages(kvm);
+	kvm_shadow_mmu_zap_obsolete_pages(kvm);
 
 	write_unlock(&kvm->mmu_lock);
 
@@ -2890,7 +2891,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 
 	kvm_mmu_invalidate_begin(kvm, 0, -1ul);
 
-	flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
+	flush = kvm_shadow_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
 
 	if (tdp_mmu_enabled) {
 		for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
@@ -3034,7 +3035,7 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 {
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		kvm_rmap_zap_collapsible_sptes(kvm, slot);
+		kvm_shadow_mmu_zap_collapsible_sptes(kvm, slot);
 		write_unlock(&kvm->mmu_lock);
 	}
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 1251357794538..14a8c8217c4cf 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -866,7 +866,7 @@ int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	if (is_page_fault_stale(vcpu, fault))
 		goto out_unlock;
 
-	r = make_mmu_pages_available(vcpu);
+	r = kvm_shadow_mmu_make_pages_available(vcpu);
 	if (r)
 		goto out_unlock;
 	r = FNAME(fetch)(vcpu, fault, &walker);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 76c50aca3c487..36b335d75aee2 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1977,7 +1977,7 @@ static inline unsigned long kvm_mmu_available_pages(struct kvm *kvm)
 	return 0;
 }
 
-int make_mmu_pages_available(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu)
 {
 	unsigned long avail = kvm_mmu_available_pages(vcpu->kvm);
 
@@ -2041,7 +2041,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 	return r;
 }
 
-int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
+int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 {
 	gpa_t gpa;
 	int r;
@@ -2331,7 +2331,7 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
 	__direct_pte_prefetch(vcpu, sp, sptep);
 }
 
-int direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_shadow_walk_iterator it;
 	struct kvm_mmu_page *sp;
@@ -2549,7 +2549,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		return r;
 
 	write_lock(&vcpu->kvm->mmu_lock);
-	r = make_mmu_pages_available(vcpu);
+	r = kvm_shadow_mmu_make_pages_available(vcpu);
 	if (r < 0)
 		goto out_unlock;
 
@@ -2797,7 +2797,8 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu)
  *
  * Must be called between walk_shadow_page_lockless_{begin,end}.
  */
-int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
+int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			    int *root_level)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	int leaf = -1;
@@ -3104,7 +3105,7 @@ __always_inline bool walk_slot_rmaps_4k(struct kvm *kvm,
 }
 
 #define BATCH_ZAP_PAGES	10
-void kvm_zap_obsolete_pages(struct kvm *kvm)
+void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
 {
 	struct kvm_mmu_page *sp, *node;
 	int nr_zapped, batch = 0;
@@ -3165,7 +3166,7 @@ bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
 	return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
 }
 
-bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
+bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 {
 	const struct kvm_memory_slot *memslot;
 	struct kvm_memslots *slots;
@@ -3417,8 +3418,8 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 	return need_tlb_flush;
 }
 
-void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
-				    const struct kvm_memory_slot *slot)
+void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
+					  const struct kvm_memory_slot *slot)
 {
 	/*
 	 * Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 9e27d03fbe368..cc28895d2a24f 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -73,18 +73,19 @@ bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 			      struct list_head *invalid_list);
 void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list);
 
-int make_mmu_pages_available(struct kvm_vcpu *vcpu);
+int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
 
-int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
+int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 
-int direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
 
 hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level);
 int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
 int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
 
-int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level);
+int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			    int *root_level);
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 		       int bytes, struct kvm_page_track_notifier_node *node);
@@ -99,8 +100,8 @@ bool walk_slot_rmaps(struct kvm *kvm, const struct kvm_memory_slot *slot,
 bool walk_slot_rmaps_4k(struct kvm *kvm, const struct kvm_memory_slot *slot,
 			slot_rmaps_handler fn, bool flush_on_yield);
 
-void kvm_zap_obsolete_pages(struct kvm *kvm);
-bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
+void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
+bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 			     const struct kvm_memory_slot *slot);
@@ -109,8 +110,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
 					 const struct kvm_memory_slot *slot,
 					 gfn_t start, gfn_t end,
 					 int target_level);
-void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
-				    const struct kvm_memory_slot *slot);
+void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
+					  const struct kvm_memory_slot *slot);
 
 bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
 unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 13/21] KVM: x86/MMU: Fix naming on prepare / commit zap page functions
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (10 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 12/21] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 14/21] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c Ben Gardon
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Since the various prepare / commit zap page functions are part of the
Shadow MMU and used throughout both shadow_mmu.c and mmu.c, add _shadow_
to the function names to match the rest of the Shadow MMU interface.
Because these functions have so many callers, the rename gets its own
commit.
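
For example (from the diff below):

	kvm_mmu_prepare_zap_page()   -> kvm_shadow_mmu_prepare_zap_page()
	kvm_mmu_commit_zap_page()    -> kvm_shadow_mmu_commit_zap_page()
	__kvm_mmu_prepare_zap_page() -> __kvm_shadow_mmu_prepare_zap_page()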

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 21 +++++++--------
 arch/x86/kvm/mmu/shadow_mmu.c | 48 ++++++++++++++++++-----------------
 arch/x86/kvm/mmu/shadow_mmu.h | 13 +++++-----
 3 files changed, 43 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9308ab8102f9b..9b217e04cab0e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -230,8 +230,9 @@ void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 		kvm_tdp_mmu_walk_lockless_end();
 	} else {
 		/*
-		 * Make sure the write to vcpu->mode is not reordered in front of
-		 * reads to sptes.  If it does, kvm_mmu_commit_zap_page() can see us
+		 * Make sure the write to vcpu->mode is not reordered in front
+		 * of reads to sptes.  If it does,
+		 * kvm_shadow_mmu_commit_zap_page() can see us
 		 * OUTSIDE_GUEST_MODE and proceed to free the shadow page table.
 		 */
 		smp_store_release(&vcpu->mode, OUTSIDE_GUEST_MODE);
@@ -568,7 +569,7 @@ bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list
 		return false;
 
 	if (!list_empty(invalid_list))
-		kvm_mmu_commit_zap_page(kvm, invalid_list);
+		kvm_shadow_mmu_commit_zap_page(kvm, invalid_list);
 	else
 		kvm_flush_remote_tlbs(kvm);
 	return true;
@@ -1022,7 +1023,7 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 	if (is_tdp_mmu_page(sp))
 		kvm_tdp_mmu_put_root(kvm, sp, false);
 	else if (!--sp->root_count && sp->role.invalid)
-		kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
+		kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list);
 
 	*root_hpa = INVALID_PAGE;
 }
@@ -1075,7 +1076,7 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
 		mmu->root.pgd = 0;
 	}
 
-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 	write_unlock(&kvm->mmu_lock);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_free_roots);
@@ -1397,8 +1398,8 @@ bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	 * there is a pending request to free obsolete roots.  The request is
 	 * only a hint that the current root _may_ be obsolete and needs to be
 	 * reloaded, e.g. if the guest frees a PGD that KVM is tracking as a
-	 * previous root, then __kvm_mmu_prepare_zap_page() signals all vCPUs
-	 * to reload even if no vCPU is actively using the root.
+	 * previous root, then __kvm_shadow_mmu_prepare_zap_page() signals all
+	 * vCPUs to reload even if no vCPU is actively using the root.
 	 */
 	if (!sp && kvm_test_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
 		return true;
@@ -3101,13 +3102,13 @@ void kvm_mmu_zap_all(struct kvm *kvm)
 	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
 		if (WARN_ON(sp->role.invalid))
 			continue;
-		if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
+		if (__kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
 			goto restart;
 		if (cond_resched_rwlock_write(&kvm->mmu_lock))
 			goto restart;
 	}
 
-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 
 	if (tdp_mmu_enabled)
 		kvm_tdp_mmu_zap_all(kvm);
@@ -3457,7 +3458,7 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 		else if (is_tdp_mmu_page(sp))
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
 		else
-			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+			kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list);
 		WARN_ON_ONCE(sp->nx_huge_page_disallowed);
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 36b335d75aee2..32a24530cf19a 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1282,7 +1282,7 @@ static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	int ret = vcpu->arch.mmu->sync_page(vcpu, sp);
 
 	if (ret < 0)
-		kvm_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
+		kvm_shadow_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
 	return ret;
 }
 
@@ -1444,8 +1444,8 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
 			 * upper-level page will be write-protected.
 			 */
 			if (role.level > PG_LEVEL_4K && sp->unsync)
-				kvm_mmu_prepare_zap_page(kvm, sp,
-							 &invalid_list);
+				kvm_shadow_mmu_prepare_zap_page(kvm, sp,
+								&invalid_list);
 			continue;
 		}
 
@@ -1487,7 +1487,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
 	++kvm->stat.mmu_cache_miss;
 
 out:
-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 
 	if (collisions > kvm->stat.max_mmu_page_hash_collisions)
 		kvm->stat.max_mmu_page_hash_collisions = collisions;
@@ -1779,8 +1779,8 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
 			 */
 			if (tdp_enabled && invalid_list &&
 			    child->role.guest_mode && !child->parent_ptes.val)
-				return kvm_mmu_prepare_zap_page(kvm, child,
-								invalid_list);
+				return kvm_shadow_mmu_prepare_zap_page(kvm,
+							child, invalid_list);
 		}
 	} else if (is_mmio_spte(pte)) {
 		mmu_spte_clear_no_track(spte);
@@ -1825,7 +1825,7 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
 		struct kvm_mmu_page *sp;
 
 		for_each_sp(pages, sp, parents, i) {
-			kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
+			kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list);
 			mmu_pages_clear_parents(&parents);
 			zapped++;
 		}
@@ -1834,9 +1834,9 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
 	return zapped;
 }
 
-bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-				struct list_head *invalid_list,
-				int *nr_zapped)
+bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				       struct list_head *invalid_list,
+				       int *nr_zapped)
 {
 	bool list_unstable, zapped_root = false;
 
@@ -1898,16 +1898,17 @@ bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 	return list_unstable;
 }
 
-bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			      struct list_head *invalid_list)
+bool kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				     struct list_head *invalid_list)
 {
 	int nr_zapped;
 
-	__kvm_mmu_prepare_zap_page(kvm, sp, invalid_list, &nr_zapped);
+	__kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list, &nr_zapped);
 	return nr_zapped;
 }
 
-void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list)
+void kvm_shadow_mmu_commit_zap_page(struct kvm *kvm,
+				    struct list_head *invalid_list)
 {
 	struct kvm_mmu_page *sp, *nsp;
 
@@ -1952,8 +1953,8 @@ static unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
 		if (sp->root_count)
 			continue;
 
-		unstable = __kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list,
-						      &nr_zapped);
+		unstable = __kvm_shadow_mmu_prepare_zap_page(kvm, sp,
+						&invalid_list, &nr_zapped);
 		total_zapped += nr_zapped;
 		if (total_zapped >= nr_to_zap)
 			break;
@@ -1962,7 +1963,7 @@ static unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
 			goto restart;
 	}
 
-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 
 	kvm->stat.mmu_recycled += total_zapped;
 	return total_zapped;
@@ -2033,9 +2034,9 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
 			 sp->role.word);
 		r = 1;
-		kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+		kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list);
 	}
-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 	write_unlock(&kvm->mmu_lock);
 
 	return r;
@@ -3032,7 +3033,8 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 	for_each_gfn_valid_sp_with_gptes(vcpu->kvm, sp, gfn) {
 		if (detect_write_misaligned(sp, gpa, bytes) ||
 		      detect_write_flooding(sp)) {
-			kvm_mmu_prepare_zap_page(vcpu->kvm, sp, &invalid_list);
+			kvm_shadow_mmu_prepare_zap_page(vcpu->kvm, sp,
+							&invalid_list);
 			++vcpu->kvm->stat.mmu_flooded;
 			continue;
 		}
@@ -3141,7 +3143,7 @@ void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
 			goto restart;
 		}
 
-		unstable = __kvm_mmu_prepare_zap_page(kvm, sp,
+		unstable = __kvm_shadow_mmu_prepare_zap_page(kvm, sp,
 				&kvm->arch.zapped_obsolete_pages, &nr_zapped);
 		batch += nr_zapped;
 
@@ -3158,7 +3160,7 @@ void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
 	 * kvm_mmu_load()), and the reload in the caller ensure no vCPUs are
 	 * running with an obsolete MMU.
 	 */
-	kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+	kvm_shadow_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
 }
 
 bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
@@ -3439,7 +3441,7 @@ unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
 	write_lock(&kvm->mmu_lock);
 
 	if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
-		kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+		kvm_shadow_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
 		goto out;
 	}
 
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index cc28895d2a24f..82eed9bb9bc9a 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -66,12 +66,13 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 
 void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
 
-bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-				struct list_head *invalid_list,
-				int *nr_zapped);
-bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			      struct list_head *invalid_list);
-void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list);
+bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				       struct list_head *invalid_list,
+				       int *nr_zapped);
+bool kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				     struct list_head *invalid_list);
+void kvm_shadow_mmu_commit_zap_page(struct kvm *kvm,
+				    struct list_head *invalid_list);
 
 int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 14/21] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (11 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 13/21] KVM: x86/MMU: Fix naming on prepare / commit zap page functions Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

There are several functions in mmu.c which bifurcate to the Shadow
and/or TDP MMU implementations. In most of these, the Shadow MMU
implementation is open-coded. Wrap each of these open-coded instances in
a function which needs only kvm and slot arguments or similar. This
matches the TDP MMU interface and will allow for some nice cleanups in a
following commit.
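
For example (condensed from the diff below), the open-coded rmap walk in
kvm_mmu_slot_leaf_clear_dirty() becomes a single call into the Shadow
MMU:

	/* mmu.c */
	if (kvm_memslots_have_rmaps(kvm)) {
		write_lock(&kvm->mmu_lock);
		kvm_shadow_mmu_clear_dirty_slot(kvm, memslot);
		write_unlock(&kvm->mmu_lock);
	}

	/* shadow_mmu.c */
	void kvm_shadow_mmu_clear_dirty_slot(struct kvm *kvm,
					     const struct kvm_memory_slot *memslot)
	{
		walk_slot_rmaps_4k(kvm, memslot, __rmap_clear_dirty, false);
	}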

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 52 ++++++----------------------
 arch/x86/kvm/mmu/shadow_mmu.c | 64 +++++++++++++++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h | 15 ++++++++
 3 files changed, 90 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9b217e04cab0e..44a00396284d5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -377,23 +377,13 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask)
 {
-	struct kvm_rmap_head *rmap_head;
-
 	if (tdp_mmu_enabled)
 		kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
 				slot->base_gfn + gfn_offset, mask, true);
 
-	if (!kvm_memslots_have_rmaps(kvm))
-		return;
-
-	while (mask) {
-		rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
-					PG_LEVEL_4K, slot);
-		rmap_write_protect(rmap_head, false);
-
-		/* clear the first set bit */
-		mask &= mask - 1;
-	}
+	if (kvm_memslots_have_rmaps(kvm))
+		kvm_shadow_mmu_write_protect_pt_masked(kvm, slot, gfn_offset,
+						       mask);
 }
 
 /**
@@ -410,23 +400,13 @@ static void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 					 struct kvm_memory_slot *slot,
 					 gfn_t gfn_offset, unsigned long mask)
 {
-	struct kvm_rmap_head *rmap_head;
-
 	if (tdp_mmu_enabled)
 		kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
 				slot->base_gfn + gfn_offset, mask, false);
 
-	if (!kvm_memslots_have_rmaps(kvm))
-		return;
-
-	while (mask) {
-		rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
-					PG_LEVEL_4K, slot);
-		__rmap_clear_dirty(kvm, rmap_head, slot);
-
-		/* clear the first set bit */
-		mask &= mask - 1;
-	}
+	if (kvm_memslots_have_rmaps(kvm))
+		kvm_shadow_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset,
+						     mask);
 }
 
 /**
@@ -484,16 +464,11 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 				    struct kvm_memory_slot *slot, u64 gfn,
 				    int min_level)
 {
-	struct kvm_rmap_head *rmap_head;
-	int i;
 	bool write_protected = false;
 
-	if (kvm_memslots_have_rmaps(kvm)) {
-		for (i = min_level; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
-			rmap_head = gfn_to_rmap(gfn, i, slot);
-			write_protected |= rmap_write_protect(rmap_head, true);
-		}
-	}
+	if (kvm_memslots_have_rmaps(kvm))
+		write_protected |=
+			kvm_shadow_mmu_write_protect_gfn(kvm, slot, gfn, min_level);
 
 	if (tdp_mmu_enabled)
 		write_protected |=
@@ -2915,8 +2890,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 {
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		walk_slot_rmaps(kvm, memslot, slot_rmap_write_protect,
-				start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+		kvm_shadow_mmu_wrprot_slot(kvm, memslot, start_level);
 		write_unlock(&kvm->mmu_lock);
 	}
 
@@ -3067,11 +3041,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 {
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		/*
-		 * Clear dirty bits only on 4k SPTEs since the legacy MMU only
-		 * support dirty logging at a 4k granularity.
-		 */
-		walk_slot_rmaps_4k(kvm, memslot, __rmap_clear_dirty, false);
+		kvm_shadow_mmu_clear_dirty_slot(kvm, memslot);
 		write_unlock(&kvm->mmu_lock);
 	}
 
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 32a24530cf19a..b93a6174717d3 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3453,3 +3453,67 @@ unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
 
 	return freed;
 }
+
+void kvm_shadow_mmu_write_protect_pt_masked(struct kvm *kvm,
+					    struct kvm_memory_slot *slot,
+					    gfn_t gfn_offset, unsigned long mask)
+{
+	struct kvm_rmap_head *rmap_head;
+
+	while (mask) {
+		rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+					PG_LEVEL_4K, slot);
+		rmap_write_protect(rmap_head, false);
+
+		/* clear the first set bit */
+		mask &= mask - 1;
+	}
+}
+
+void kvm_shadow_mmu_clear_dirty_pt_masked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn_offset, unsigned long mask)
+{
+	struct kvm_rmap_head *rmap_head;
+
+	while (mask) {
+		rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+					PG_LEVEL_4K, slot);
+		__rmap_clear_dirty(kvm, rmap_head, slot);
+
+		/* clear the first set bit */
+		mask &= mask - 1;
+	}
+}
+
+bool kvm_shadow_mmu_write_protect_gfn(struct kvm *kvm,
+				      struct kvm_memory_slot *slot,
+				      u64 gfn, int min_level)
+{
+	struct kvm_rmap_head *rmap_head;
+	int i;
+	bool write_protected = false;
+
+	if (kvm_memslots_have_rmaps(kvm)) {
+		for (i = min_level; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
+			rmap_head = gfn_to_rmap(gfn, i, slot);
+			write_protected |= rmap_write_protect(rmap_head, true);
+		}
+	}
+
+	return write_protected;
+}
+
+void kvm_shadow_mmu_clear_dirty_slot(struct kvm *kvm,
+				     const struct kvm_memory_slot *memslot)
+{
+	walk_slot_rmaps_4k(kvm, memslot, __rmap_clear_dirty, false);
+}
+
+void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
+				const struct kvm_memory_slot *memslot,
+				int start_level)
+{
+	walk_slot_rmaps(kvm, memslot, slot_rmap_write_protect,
+			start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 82eed9bb9bc9a..58f48293b4773 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -117,6 +117,21 @@ void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
 bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
 unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
 
+void kvm_shadow_mmu_write_protect_pt_masked(struct kvm *kvm,
+					    struct kvm_memory_slot *slot,
+					    gfn_t gfn_offset, unsigned long mask);
+void kvm_shadow_mmu_clear_dirty_pt_masked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn_offset, unsigned long mask);
+bool kvm_shadow_mmu_write_protect_gfn(struct kvm *kvm,
+				      struct kvm_memory_slot *slot,
+				      u64 gfn, int min_level);
+void kvm_shadow_mmu_clear_dirty_slot(struct kvm *kvm,
+				     const struct kvm_memory_slot *memslot);
+void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
+				const struct kvm_memory_slot *memslot,
+				int start_level);
+
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			  gpa_t vaddr, u64 access,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (12 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 14/21] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-04 14:44   ` kernel test robot
  2023-02-02 18:28 ` [PATCH 16/21] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Now that the various dirty logging / wrprot function implementations are
in shadow_mmu.c, do another round of cleanups to drop the exports for
functions which no longer need to be exposed and can be marked static.

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/shadow_mmu.c | 32 +++++++++++++++++++-------------
 arch/x86/kvm/mmu/shadow_mmu.h | 18 ------------------
 2 files changed, 19 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index b93a6174717d3..dc5c4b9899cc6 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -634,8 +634,8 @@ unsigned int pte_list_count(struct kvm_rmap_head *rmap_head)
 	return count;
 }
 
-struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
-				  const struct kvm_memory_slot *slot)
+static struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
+					 const struct kvm_memory_slot *slot)
 {
 	unsigned long idx;
 
@@ -803,7 +803,7 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect)
 	return mmu_spte_update(sptep, spte);
 }
 
-bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect)
+static bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -842,8 +842,8 @@ static bool spte_wrprot_for_clear_dirty(u64 *sptep)
  *	- W bit on ad-disabled SPTEs.
  * Returns true iff any D or W bits were cleared.
  */
-bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-			const struct kvm_memory_slot *slot)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			       const struct kvm_memory_slot *slot)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -3057,6 +3057,11 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 	write_unlock(&vcpu->kvm->mmu_lock);
 }
 
+/* The return value indicates if tlb flush on all vcpus is needed. */
+typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
+				    struct kvm_rmap_head *rmap_head,
+				    const struct kvm_memory_slot *slot);
+
 static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
 					      const struct kvm_memory_slot *slot,
 					      slot_rmaps_handler fn,
@@ -3087,20 +3092,21 @@ static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
 	return flush;
 }
 
-__always_inline bool walk_slot_rmaps(struct kvm *kvm,
-				     const struct kvm_memory_slot *slot,
-				     slot_rmaps_handler fn, int start_level,
-				     int end_level, bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps(struct kvm *kvm,
+					    const struct kvm_memory_slot *slot,
+					    slot_rmaps_handler fn,
+					    int start_level, int end_level,
+					    bool flush_on_yield)
 {
 	return __walk_slot_rmaps(kvm, slot, fn, start_level, end_level,
 				 slot->base_gfn, slot->base_gfn + slot->npages - 1,
 				 flush_on_yield, false);
 }
 
-__always_inline bool walk_slot_rmaps_4k(struct kvm *kvm,
-					const struct kvm_memory_slot *slot,
-					slot_rmaps_handler fn,
-					bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps_4k(struct kvm *kvm,
+					       const struct kvm_memory_slot *slot,
+					       slot_rmaps_handler fn,
+					       bool flush_on_yield)
 {
 	return walk_slot_rmaps(kvm, slot, fn, PG_LEVEL_4K,
 			       PG_LEVEL_4K, flush_on_yield);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 58f48293b4773..36fe8013931d2 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -39,11 +39,6 @@ struct pte_list_desc {
 /* Only exported for debugfs.c. */
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
 
-struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
-				  const struct kvm_memory_slot *slot);
-bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect);
-bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-			const struct kvm_memory_slot *slot);
 bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 		  struct kvm_memory_slot *slot, gfn_t gfn, int level,
 		  pte_t unused);
@@ -91,22 +86,9 @@ int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 		       int bytes, struct kvm_page_track_notifier_node *node);
 
-/* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
-				    struct kvm_rmap_head *rmap_head,
-				    const struct kvm_memory_slot *slot);
-bool walk_slot_rmaps(struct kvm *kvm, const struct kvm_memory_slot *slot,
-		       slot_rmaps_handler fn, int start_level, int end_level,
-		       bool flush_on_yield);
-bool walk_slot_rmaps_4k(struct kvm *kvm, const struct kvm_memory_slot *slot,
-			slot_rmaps_handler fn, bool flush_on_yield);
-
 void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
 bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
-bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-			     const struct kvm_memory_slot *slot);
-
 void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
 					 const struct kvm_memory_slot *slot,
 					 gfn_t start, gfn_t end,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 16/21] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (13 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 17/21] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h Ben Gardon
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

kvm_handle_gfn_range() plus a callback is not a bad interface, but it
requires exporting the whole callback scheme to mmu.c. Simplify the
interface with some basic wrapper functions, making the callback scheme
internal to shadow_mmu.c.
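
For example (condensed from the diff below), each call site in mmu.c now
goes through a one-line wrapper, and the rmap_handler_t callbacks stay
private to shadow_mmu.c:

	/* mmu.c */
	if (kvm_memslots_have_rmaps(kvm))
		flush = kvm_shadow_mmu_unmap_gfn_range(kvm, range);

	/* shadow_mmu.c */
	bool kvm_shadow_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
	{
		return kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
	}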

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        |  8 +++---
 arch/x86/kvm/mmu/shadow_mmu.c | 54 +++++++++++++++++++++++++----------
 arch/x86/kvm/mmu/shadow_mmu.h | 25 ++++------------
 3 files changed, 48 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 44a00396284d5..156ab2e4cd811 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -490,7 +490,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 	bool flush = false;
 
 	if (kvm_memslots_have_rmaps(kvm))
-		flush = kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
+		flush = kvm_shadow_mmu_unmap_gfn_range(kvm, range);
 
 	if (tdp_mmu_enabled)
 		flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush);
@@ -503,7 +503,7 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	bool flush = false;
 
 	if (kvm_memslots_have_rmaps(kvm))
-		flush = kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmap);
+		flush = kvm_shadow_mmu_set_spte_gfn(kvm, range);
 
 	if (tdp_mmu_enabled)
 		flush |= kvm_tdp_mmu_set_spte_gfn(kvm, range);
@@ -516,7 +516,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	bool young = false;
 
 	if (kvm_memslots_have_rmaps(kvm))
-		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+		young = kvm_shadow_mmu_age_gfn_range(kvm, range);
 
 	if (tdp_mmu_enabled)
 		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
@@ -529,7 +529,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	bool young = false;
 
 	if (kvm_memslots_have_rmaps(kvm))
-		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
+		young = kvm_shadow_mmu_test_age_gfn(kvm, range);
 
 	if (tdp_mmu_enabled)
 		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index dc5c4b9899cc6..dfff65db97c3b 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -864,16 +864,16 @@ static bool __kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	return kvm_zap_all_rmap_sptes(kvm, rmap_head);
 }
 
-bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		  struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		  pte_t unused)
+static bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			 struct kvm_memory_slot *slot, gfn_t gfn, int level,
+			 pte_t unused)
 {
 	return __kvm_zap_rmap(kvm, rmap_head, slot);
 }
 
-bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		      struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		      pte_t pte)
+static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			     struct kvm_memory_slot *slot, gfn_t gfn, int level,
+			     pte_t pte)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -980,9 +980,13 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
 	     slot_rmap_walk_okay(_iter_);				\
 	     slot_rmap_walk_next(_iter_))
 
-__always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
-					  struct kvm_gfn_range *range,
-					  rmap_handler_t handler)
+typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			       struct kvm_memory_slot *slot, gfn_t gfn,
+			       int level, pte_t pte);
+
+static __always_inline bool
+kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
+		     rmap_handler_t handler)
 {
 	struct slot_rmap_walk_iterator iterator;
 	bool ret = false;
@@ -995,9 +999,9 @@ __always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
 	return ret;
 }
 
-bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		  struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		  pte_t unused)
+static bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			 struct kvm_memory_slot *slot, gfn_t gfn, int level,
+			 pte_t unused)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -1009,9 +1013,9 @@ bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	return young;
 }
 
-bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		       struct kvm_memory_slot *slot, gfn_t gfn,
-		       int level, pte_t unused)
+static bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+			      struct kvm_memory_slot *slot, gfn_t gfn,
+			      int level, pte_t unused)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -3523,3 +3527,23 @@ void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
 	walk_slot_rmaps(kvm, memslot, slot_rmap_write_protect,
 			start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
 }
+
+bool kvm_shadow_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+	return kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
+}
+
+bool kvm_shadow_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+	return kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmap);
+}
+
+bool kvm_shadow_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+	return kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+}
+
+bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+	return kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 36fe8013931d2..e4fbc842f524e 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -39,26 +39,6 @@ struct pte_list_desc {
 /* Only exported for debugfs.c. */
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
 
-bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		  struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		  pte_t unused);
-bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		      struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		      pte_t pte);
-
-typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-			       struct kvm_memory_slot *slot, gfn_t gfn,
-			       int level, pte_t pte);
-bool kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
-			  rmap_handler_t handler);
-
-bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		  struct kvm_memory_slot *slot, gfn_t gfn, int level,
-		  pte_t unused);
-bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-		       struct kvm_memory_slot *slot, gfn_t gfn,
-		       int level, pte_t unused);
-
 void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
 
 bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
@@ -114,6 +94,11 @@ void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
 				const struct kvm_memory_slot *memslot,
 				int start_level);
 
+bool kvm_shadow_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
+
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			  gpa_t vaddr, u64 access,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 17/21] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (14 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 16/21] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 18/21] KVM: x86/mmu: Move split cache topup functions to shadow_mmu.c Ben Gardon
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Fix up the names of the last few Shadow MMU functions in shadow_mmu.h.
This gives a clean and obvious interface between the shared x86 MMU
code and the Shadow MMU. There are still a few functions exported from
paging_tmpl.h that are left as-is, but changing those will need to be
done separately, if at all.
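
As a quick illustration of the resulting convention, here is a hedged
sketch of a common-code call site after this patch (the shape mirrors
the fast_page_fault() hunk in the diff below; surrounding logic is
elided):

	if (tdp_mmu_enabled)
		sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
	else
		sptep = kvm_shadow_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);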

No functional change intended.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 19 ++++++++-------
 arch/x86/kvm/mmu/shadow_mmu.c | 44 +++++++++++++++++++----------------
 arch/x86/kvm/mmu/shadow_mmu.h | 16 +++++++------
 3 files changed, 43 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 156ab2e4cd811..f5b9db00eff99 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -884,7 +884,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		if (tdp_mmu_enabled)
 			sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
 		else
-			sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
+			sptep = kvm_shadow_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
 
 		if (!is_shadow_present_pte(spte))
 			break;
@@ -1073,7 +1073,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 		root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
 		mmu->root.hpa = root;
 	} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
-		root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level);
+		root = kvm_shadow_mmu_alloc_root(vcpu, 0, 0, shadow_root_level);
 		mmu->root.hpa = root;
 	} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
 		if (WARN_ON_ONCE(!mmu->pae_root)) {
@@ -1084,8 +1084,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 		for (i = 0; i < 4; ++i) {
 			WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
 
-			root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT), 0,
-					      PT32_ROOT_LEVEL);
+			root = kvm_shadow_mmu_alloc_root(vcpu,
+					i << (30 - PAGE_SHIFT), 0, PT32_ROOT_LEVEL);
 			mmu->pae_root[i] = root | PT_PRESENT_MASK |
 					   shadow_me_value;
 		}
@@ -1663,7 +1663,7 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
 	 * count. Otherwise, clear the write flooding count.
 	 */
 	if (!new_role.direct)
-		__clear_sp_write_flooding_count(
+		kvm_shadow_mmu_clear_sp_write_flooding_count(
 				to_shadow_page(vcpu->arch.mmu->root.hpa));
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);
@@ -2439,13 +2439,13 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct);
 	if (r)
 		goto out;
-	r = mmu_alloc_special_roots(vcpu);
+	r = kvm_shadow_mmu_alloc_special_roots(vcpu);
 	if (r)
 		goto out;
 	if (vcpu->arch.mmu->root_role.direct)
 		r = mmu_alloc_direct_roots(vcpu);
 	else
-		r = mmu_alloc_shadow_roots(vcpu);
+		r = kvm_shadow_mmu_alloc_shadow_roots(vcpu);
 	if (r)
 		goto out;
 
@@ -2674,7 +2674,8 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 	 * generally doesn't use PAE paging and can skip allocating the PDP
 	 * table.  The main exception, handled here, is SVM's 32-bit NPT.  The
 	 * other exception is for shadowing L1's 32-bit or PAE NPT on 64-bit
-	 * KVM; that horror is handled on-demand by mmu_alloc_special_roots().
+	 * KVM; that horror is handled on-demand by
+	 * kvm_shadow_mmu_alloc_special_roots().
 	 */
 	if (tdp_enabled && kvm_mmu_get_tdp_level(vcpu) > PT32E_ROOT_LEVEL)
 		return 0;
@@ -2817,7 +2818,7 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 			return r;
 	}
 
-	node->track_write = kvm_mmu_pte_write;
+	node->track_write = kvm_shadow_mmu_pte_write;
 	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
 	kvm_page_track_register_notifier(kvm, node);
 
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index dfff65db97c3b..eb4424fedd73a 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1404,14 +1404,14 @@ static int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
 	return 0;
 }
 
-void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
+void kvm_shadow_mmu_clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
 {
 	atomic_set(&sp->write_flooding_count,  0);
 }
 
 static void clear_sp_write_flooding_count(u64 *spte)
 {
-	__clear_sp_write_flooding_count(sptep_to_sp(spte));
+	kvm_shadow_mmu_clear_sp_write_flooding_count(sptep_to_sp(spte));
 }
 
 /*
@@ -1482,7 +1482,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
 				kvm_flush_remote_tlbs(kvm);
 		}
 
-		__clear_sp_write_flooding_count(sp);
+		kvm_shadow_mmu_clear_sp_write_flooding_count(sp);
 
 		goto out;
 	}
@@ -1607,12 +1607,13 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct,
 	 * Concretely, a 4-byte PDE consumes bits 31:22, while an 8-byte PDE
 	 * consumes bits 29:21.  To consume bits 31:30, KVM's uses 4 shadow
 	 * PDPTEs; those 4 PAE page directories are pre-allocated and their
-	 * quadrant is assigned in mmu_alloc_root().   A 4-byte PTE consumes
-	 * bits 21:12, while an 8-byte PTE consumes bits 20:12.  To consume
-	 * bit 21 in the PTE (the child here), KVM propagates that bit to the
-	 * quadrant, i.e. sets quadrant to '0' or '1'.  The parent 8-byte PDE
-	 * covers bit 21 (see above), thus the quadrant is calculated from the
-	 * _least_ significant bit of the PDE index.
+	 * quadrant is assigned in kvm_shadow_mmu_alloc_root().
+	 * A 4-byte PTE consumes bits 21:12, while an 8-byte PTE consumes
+	 * bits 20:12.  To consume bit 21 in the PTE (the child here), KVM
+	 * propagates that bit to the quadrant, i.e. sets quadrant to
+	 * '0' or '1'.  The parent 8-byte PDE covers bit 21 (see above), thus
+	 * the quadrant is calculated from the _least_ significant bit of the
+	 * PDE index.
 	 */
 	if (role.has_4_byte_gpte) {
 		WARN_ON_ONCE(role.level != PG_LEVEL_4K);
@@ -2389,7 +2390,8 @@ int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *faul
  *  - Must be called between walk_shadow_page_lockless_{begin,end}.
  *  - The returned sptep must not be used after walk_shadow_page_lockless_end.
  */
-u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte)
+u64 *kvm_shadow_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa,
+					   u64 *spte)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	u64 old_spte;
@@ -2442,7 +2444,8 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn)
 	return ret;
 }
 
-hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level)
+hpa_t kvm_shadow_mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant,
+				u8 level)
 {
 	union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
 	struct kvm_mmu_page *sp;
@@ -2459,7 +2462,7 @@ hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level)
 	return __pa(sp->spt);
 }
 
-static int mmu_first_shadow_root_alloc(struct kvm *kvm)
+static int kvm_shadow_mmu_first_shadow_root_alloc(struct kvm *kvm)
 {
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *slot;
@@ -2520,7 +2523,7 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
 	return r;
 }
 
-int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	u64 pdptrs[4], pm_mask;
@@ -2549,7 +2552,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	r = mmu_first_shadow_root_alloc(vcpu->kvm);
+	r = kvm_shadow_mmu_first_shadow_root_alloc(vcpu->kvm);
 	if (r)
 		return r;
 
@@ -2563,8 +2566,8 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * write-protect the guests page table root.
 	 */
 	if (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
-		root = mmu_alloc_root(vcpu, root_gfn, 0,
-				      mmu->root_role.level);
+		root = kvm_shadow_mmu_alloc_root(vcpu, root_gfn, 0,
+						 mmu->root_role.level);
 		mmu->root.hpa = root;
 		goto set_root_pgd;
 	}
@@ -2617,7 +2620,8 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		 */
 		quadrant = (mmu->cpu_role.base.level == PT32_ROOT_LEVEL) ? i : 0;
 
-		root = mmu_alloc_root(vcpu, root_gfn, quadrant, PT32_ROOT_LEVEL);
+		root = kvm_shadow_mmu_alloc_root(vcpu, root_gfn, quadrant,
+						 PT32_ROOT_LEVEL);
 		mmu->pae_root[i] = root | pm_mask;
 	}
 
@@ -2636,7 +2640,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	return r;
 }
 
-int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	bool need_pml5 = mmu->root_role.level > PT64_ROOT_4LEVEL;
@@ -3009,8 +3013,8 @@ static u64 *get_written_sptes(struct kvm_mmu_page *sp, gpa_t gpa, int *nspte)
 	return spte;
 }
 
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
-		       int bytes, struct kvm_page_track_notifier_node *node)
+void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+			      int bytes, struct kvm_page_track_notifier_node *node)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	struct kvm_mmu_page *sp;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index e4fbc842f524e..4d39017873aa6 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -39,7 +39,7 @@ struct pte_list_desc {
 /* Only exported for debugfs.c. */
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
 
-void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
+void kvm_shadow_mmu_clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
 
 bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 				       struct list_head *invalid_list,
@@ -54,17 +54,19 @@ int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
 int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 
 int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
-u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
+u64 *kvm_shadow_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa,
+					   u64 *spte);
 
-hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level);
-int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
-int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
+hpa_t kvm_shadow_mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant,
+				u8 level);
+int kvm_shadow_mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
+int kvm_shadow_mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
 
 int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 			    int *root_level);
 
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
-		       int bytes, struct kvm_page_track_notifier_node *node);
+void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+			      int bytes, struct kvm_page_track_notifier_node *node);
 
 void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
 bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 18/21] KVM: x86/mmu: Move split cache topup functions to shadow_mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (15 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 17/21] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 19/21] KVM: x86/mmu: Move Shadow MMU part of kvm_mmu_zap_all() to shadow_mmu.h Ben Gardon
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

The split cache topup functions are only used by the Shadow MMU and were
left behind in mmu.c when splitting the Shadow MMU out to a separate
file. Move them over as well.
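
For context, a rough sketch of how the Shadow MMU's eager huge page
splitting path consumes these helpers (error handling and the
surrounding split logic are elided; this is illustrative, not a
verbatim excerpt):

	if (need_topup_split_caches_or_resched(kvm)) {
		write_unlock(&kvm->mmu_lock);
		cond_resched();
		/* Top up the split caches outside mmu_lock, then retry the split. */
		r = topup_split_caches(kvm) ?: -EAGAIN;
		write_lock(&kvm->mmu_lock);
	}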

No functional change intended.

Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 53 ---------------------------------
 arch/x86/kvm/mmu/mmu_internal.h |  2 --
 arch/x86/kvm/mmu/shadow_mmu.c   | 53 +++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+), 55 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f5b9db00eff99..8514e998e2127 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2902,59 +2902,6 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 	}
 }
 
-static inline bool need_topup(struct kvm_mmu_memory_cache *cache, int min)
-{
-	return kvm_mmu_memory_cache_nr_free_objects(cache) < min;
-}
-
-bool need_topup_split_caches_or_resched(struct kvm *kvm)
-{
-	if (need_resched() || rwlock_needbreak(&kvm->mmu_lock))
-		return true;
-
-	/*
-	 * In the worst case, SPLIT_DESC_CACHE_MIN_NR_OBJECTS descriptors are needed
-	 * to split a single huge page. Calculating how many are actually needed
-	 * is possible but not worth the complexity.
-	 */
-	return need_topup(&kvm->arch.split_desc_cache, SPLIT_DESC_CACHE_MIN_NR_OBJECTS) ||
-	       need_topup(&kvm->arch.split_page_header_cache, 1) ||
-	       need_topup(&kvm->arch.split_shadow_page_cache, 1);
-}
-
-int topup_split_caches(struct kvm *kvm)
-{
-	/*
-	 * Allocating rmap list entries when splitting huge pages for nested
-	 * MMUs is uncommon as KVM needs to use a list if and only if there is
-	 * more than one rmap entry for a gfn, i.e. requires an L1 gfn to be
-	 * aliased by multiple L2 gfns and/or from multiple nested roots with
-	 * different roles.  Aliasing gfns when using TDP is atypical for VMMs;
-	 * a few gfns are often aliased during boot, e.g. when remapping BIOS,
-	 * but aliasing rarely occurs post-boot or for many gfns.  If there is
-	 * only one rmap entry, rmap->val points directly at that one entry and
-	 * doesn't need to allocate a list.  Buffer the cache by the default
-	 * capacity so that KVM doesn't have to drop mmu_lock to topup if KVM
-	 * encounters an aliased gfn or two.
-	 */
-	const int capacity = SPLIT_DESC_CACHE_MIN_NR_OBJECTS +
-			     KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
-	int r;
-
-	lockdep_assert_held(&kvm->slots_lock);
-
-	r = __kvm_mmu_topup_memory_cache(&kvm->arch.split_desc_cache, capacity,
-					 SPLIT_DESC_CACHE_MIN_NR_OBJECTS);
-	if (r)
-		return r;
-
-	r = kvm_mmu_topup_memory_cache(&kvm->arch.split_page_header_cache, 1);
-	if (r)
-		return r;
-
-	return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
-}
-
 /* Must be called with the mmu_lock held in write-mode. */
 void kvm_mmu_try_split_huge_pages(struct kvm *kvm,
 				   const struct kvm_memory_slot *memslot,
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 349d4a300ad34..2273c6263faf0 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -348,8 +348,6 @@ void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu);
 void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu);
 
 int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect);
-bool need_topup_split_caches_or_resched(struct kvm *kvm);
-int topup_split_caches(struct kvm *kvm);
 
 bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index eb4424fedd73a..bb23692d34a73 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3219,6 +3219,59 @@ bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	return rmap_write_protect(rmap_head, false);
 }
 
+static inline bool need_topup(struct kvm_mmu_memory_cache *cache, int min)
+{
+	return kvm_mmu_memory_cache_nr_free_objects(cache) < min;
+}
+
+static bool need_topup_split_caches_or_resched(struct kvm *kvm)
+{
+	if (need_resched() || rwlock_needbreak(&kvm->mmu_lock))
+		return true;
+
+	/*
+	 * In the worst case, SPLIT_DESC_CACHE_MIN_NR_OBJECTS descriptors are needed
+	 * to split a single huge page. Calculating how many are actually needed
+	 * is possible but not worth the complexity.
+	 */
+	return need_topup(&kvm->arch.split_desc_cache, SPLIT_DESC_CACHE_MIN_NR_OBJECTS) ||
+	       need_topup(&kvm->arch.split_page_header_cache, 1) ||
+	       need_topup(&kvm->arch.split_shadow_page_cache, 1);
+}
+
+static int topup_split_caches(struct kvm *kvm)
+{
+	/*
+	 * Allocating rmap list entries when splitting huge pages for nested
+	 * MMUs is uncommon as KVM needs to use a list if and only if there is
+	 * more than one rmap entry for a gfn, i.e. requires an L1 gfn to be
+	 * aliased by multiple L2 gfns and/or from multiple nested roots with
+	 * different roles.  Aliasing gfns when using TDP is atypical for VMMs;
+	 * a few gfns are often aliased during boot, e.g. when remapping BIOS,
+	 * but aliasing rarely occurs post-boot or for many gfns.  If there is
+	 * only one rmap entry, rmap->val points directly at that one entry and
+	 * doesn't need to allocate a list.  Buffer the cache by the default
+	 * capacity so that KVM doesn't have to drop mmu_lock to topup if KVM
+	 * encounters an aliased gfn or two.
+	 */
+	const int capacity = SPLIT_DESC_CACHE_MIN_NR_OBJECTS +
+			     KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
+	int r;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	r = __kvm_mmu_topup_memory_cache(&kvm->arch.split_desc_cache, capacity,
+					 SPLIT_DESC_CACHE_MIN_NR_OBJECTS);
+	if (r)
+		return r;
+
+	r = kvm_mmu_topup_memory_cache(&kvm->arch.split_page_header_cache, 1);
+	if (r)
+		return r;
+
+	return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
+}
+
 static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u64 *huge_sptep)
 {
 	struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep);
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 19/21] KVM: x86/mmu: Move Shadow MMU part of kvm_mmu_zap_all() to shadow_mmu.h
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (16 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 18/21] KVM: x86/mmu: Move split cache topup functions to shadow_mmu.c Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 20/21] KVM: x86/mmu: Move Shadow MMU init/teardown to shadow_mmu.c Ben Gardon
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Move the Shadow MMU part of kvm_mmu_zap_all() into a helper function in
shadow_mmu.h. Also check kvm_memslots_have_rmaps() so the Shadow MMU
operation can be skipped entirely if it's not needed. This could present
an opportunity to move the TDP MMU portion of the function under the
MMU lock in read mode, but since zapping all paging structures should be
a very rare and thus not performance-sensitive operation, it's not
necessary.

Suggested-by: David Matlack <dmatlack@google.com>

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 17 ++---------------
 arch/x86/kvm/mmu/shadow_mmu.c | 19 +++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h |  2 ++
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8514e998e2127..63b928bded9d1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3011,22 +3011,9 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 
 void kvm_mmu_zap_all(struct kvm *kvm)
 {
-	struct kvm_mmu_page *sp, *node;
-	LIST_HEAD(invalid_list);
-	int ign;
-
 	write_lock(&kvm->mmu_lock);
-restart:
-	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
-		if (WARN_ON(sp->role.invalid))
-			continue;
-		if (__kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
-			goto restart;
-		if (cond_resched_rwlock_write(&kvm->mmu_lock))
-			goto restart;
-	}
-
-	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
+	if (kvm_memslots_have_rmaps(kvm))
+		kvm_shadow_mmu_zap_all(kvm);
 
 	if (tdp_mmu_enabled)
 		kvm_tdp_mmu_zap_all(kvm);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index bb23692d34a73..c6d3da795992e 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3604,3 +3604,22 @@ bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	return kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
 }
+
+void kvm_shadow_mmu_zap_all(struct kvm *kvm)
+{
+	struct kvm_mmu_page *sp, *node;
+	LIST_HEAD(invalid_list);
+	int ign;
+
+restart:
+	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
+		if (WARN_ON(sp->role.invalid))
+			continue;
+		if (__kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
+			goto restart;
+		if (cond_resched_rwlock_write(&kvm->mmu_lock))
+			goto restart;
+	}
+
+	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 4d39017873aa6..ab01636373bda 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -101,6 +101,8 @@ bool kvm_shadow_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
 bool kvm_shadow_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
 bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
 
+void kvm_shadow_mmu_zap_all(struct kvm *kvm);
+
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			  gpa_t vaddr, u64 access,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 20/21] KVM: x86/mmu: Move Shadow MMU init/teardown to shadow_mmu.c
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (17 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 19/21] KVM: x86/mmu: Move Shadow MMU part of kvm_mmu_zap_all() to shadow_mmu.h Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-02-02 18:28 ` [PATCH 21/21] KVM: x86/mmu: Split out Shadow MMU lockless walk begin/end Ben Gardon
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Move the meat of kvm_mmu_init_vm() and kvm_mmu_uninit_vm() pertaining to
the Shadow MMU to shadow_mmu.c.

Suggested-by: David Matlack <dmatlack@google.com>

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 41 +++---------------------------
 arch/x86/kvm/mmu/mmu_internal.h |  2 ++
 arch/x86/kvm/mmu/shadow_mmu.c   | 44 +++++++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/shadow_mmu.h   |  6 ++---
 4 files changed, 51 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 63b928bded9d1..10aff23dea75d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2743,7 +2743,7 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
  * not use any resource of the being-deleted slot or all slots
  * after calling the function.
  */
-static void kvm_mmu_zap_all_fast(struct kvm *kvm)
+void kvm_mmu_zap_all_fast(struct kvm *kvm)
 {
 	lockdep_assert_held(&kvm->slots_lock);
 
@@ -2795,22 +2795,13 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 		kvm_tdp_mmu_zap_invalidated_roots(kvm);
 }
 
-static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
-			struct kvm_memory_slot *slot,
-			struct kvm_page_track_notifier_node *node)
-{
-	kvm_mmu_zap_all_fast(kvm);
-}
-
 int kvm_mmu_init_vm(struct kvm *kvm)
 {
-	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
 	int r;
 
-	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
-	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
 	INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages);
-	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
+
+	kvm_mmu_init_shadow_mmu(kvm);
 
 	if (tdp_mmu_enabled) {
 		r = kvm_mmu_init_tdp_mmu(kvm);
@@ -2818,38 +2809,14 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 			return r;
 	}
 
-	node->track_write = kvm_shadow_mmu_pte_write;
-	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
-	kvm_page_track_register_notifier(kvm, node);
-
-	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
-	kvm->arch.split_page_header_cache.gfp_zero = __GFP_ZERO;
-
-	kvm->arch.split_shadow_page_cache.gfp_zero = __GFP_ZERO;
-
-	kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache;
-	kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO;
-
 	return 0;
 }
 
-static void mmu_free_vm_memory_caches(struct kvm *kvm)
-{
-	kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
-	kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
-	kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
-}
-
 void kvm_mmu_uninit_vm(struct kvm *kvm)
 {
-	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
-
-	kvm_page_track_unregister_notifier(kvm, node);
-
+	kvm_mmu_uninit_shadow_mmu(kvm);
 	if (tdp_mmu_enabled)
 		kvm_mmu_uninit_tdp_mmu(kvm);
-
-	mmu_free_vm_memory_caches(kvm);
 }
 
 /*
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 2273c6263faf0..c49d302b037ec 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -406,4 +406,6 @@ BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pke);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, la57);
 BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
 BUILD_MMU_ROLE_ACCESSOR(ext,  efer, lma);
+
+void kvm_mmu_zap_all_fast(struct kvm *kvm);
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index c6d3da795992e..6449ac4de4883 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3013,8 +3013,9 @@ static u64 *get_written_sptes(struct kvm_mmu_page *sp, gpa_t gpa, int *nspte)
 	return spte;
 }
 
-void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
-			      int bytes, struct kvm_page_track_notifier_node *node)
+static void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
+				     const u8 *new, int bytes,
+				     struct kvm_page_track_notifier_node *node)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	struct kvm_mmu_page *sp;
@@ -3623,3 +3624,42 @@ void kvm_shadow_mmu_zap_all(struct kvm *kvm)
 
 	kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
 }
+
+static void kvm_shadow_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
+			struct kvm_memory_slot *slot,
+			struct kvm_page_track_notifier_node *node)
+{
+	kvm_mmu_zap_all_fast(kvm);
+}
+
+void kvm_mmu_init_shadow_mmu(struct kvm *kvm)
+{
+	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
+
+	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
+	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
+
+	node->track_write = kvm_shadow_mmu_pte_write;
+	node->track_flush_slot = kvm_shadow_mmu_invalidate_zap_pages_in_memslot;
+	kvm_page_track_register_notifier(kvm, node);
+
+	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
+	kvm->arch.split_page_header_cache.gfp_zero = __GFP_ZERO;
+
+	kvm->arch.split_shadow_page_cache.gfp_zero = __GFP_ZERO;
+
+	kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache;
+	kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO;
+}
+
+void kvm_mmu_uninit_shadow_mmu(struct kvm *kvm)
+{
+	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
+
+	kvm_page_track_unregister_notifier(kvm, node);
+
+	kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
+	kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
+	kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index ab01636373bda..f2e54355ebb19 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -65,9 +65,6 @@ int kvm_shadow_mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
 int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 			    int *root_level);
 
-void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
-			      int bytes, struct kvm_page_track_notifier_node *node);
-
 void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
 bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
@@ -103,6 +100,9 @@ bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
 
 void kvm_shadow_mmu_zap_all(struct kvm *kvm);
 
+void kvm_mmu_init_shadow_mmu(struct kvm *kvm);
+void kvm_mmu_uninit_shadow_mmu(struct kvm *kvm);
+
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			  gpa_t vaddr, u64 access,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 21/21] KVM: x86/mmu: Split out Shadow MMU lockless walk begin/end
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (18 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 20/21] KVM: x86/mmu: Move Shadow MMU init/teardown to shadow_mmu.c Ben Gardon
@ 2023-02-02 18:28 ` Ben Gardon
  2023-03-20 19:09 ` [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Sean Christopherson
  2023-03-23 23:02 ` Sean Christopherson
  21 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-02 18:28 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
	Vipin Sharma, Ricardo Koller, Ben Gardon

Split the Shadow MMU meat of walk_shadow_page_lockless_begin/end() out
into kvm_shadow_mmu_walk_lockless_begin/end() in shadow_mmu.c, since
there's no need for it in the common MMU code.
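
For reference, a hedged sketch of the usage pattern the begin/end pair
backs: lockless walks bracket their SPTE reads with these calls (the
loop body here is only illustrative):

	walk_shadow_page_lockless_begin(vcpu);
	for_each_shadow_entry_lockless(vcpu, addr, iterator, spte) {
		/* Read SPTEs without taking mmu_lock. */
	}
	walk_shadow_page_lockless_end(vcpu);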

Suggested-by: David Matlack <dmatlack@google.com>

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c        | 31 ++++++-------------------------
 arch/x86/kvm/mmu/shadow_mmu.c | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h |  3 +++
 3 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 10aff23dea75d..cfccc4c7a1427 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -207,37 +207,18 @@ static inline bool is_tdp_mmu_active(struct kvm_vcpu *vcpu)
 
 void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 {
-	if (is_tdp_mmu_active(vcpu)) {
+	if (is_tdp_mmu_active(vcpu))
 		kvm_tdp_mmu_walk_lockless_begin();
-	} else {
-		/*
-		 * Prevent page table teardown by making any free-er wait during
-		 * kvm_flush_remote_tlbs() IPI to all active vcpus.
-		 */
-		local_irq_disable();
-
-		/*
-		 * Make sure a following spte read is not reordered ahead of the write
-		 * to vcpu->mode.
-		 */
-		smp_store_mb(vcpu->mode, READING_SHADOW_PAGE_TABLES);
-	}
+	else
+		kvm_shadow_mmu_walk_lockless_begin(vcpu);
 }
 
 void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 {
-	if (is_tdp_mmu_active(vcpu)) {
+	if (is_tdp_mmu_active(vcpu))
 		kvm_tdp_mmu_walk_lockless_end();
-	} else {
-		/*
-		 * Make sure the write to vcpu->mode is not reordered in front
-		 * of reads to sptes.  If it does,
-		 * kvm_shadow_mmu_commit_zap_page() can see us
-		 * OUTSIDE_GUEST_MODE and proceed to free the shadow page table.
-		 */
-		smp_store_release(&vcpu->mode, OUTSIDE_GUEST_MODE);
-		local_irq_enable();
-	}
+	else
+		kvm_shadow_mmu_walk_lockless_end(vcpu);
 }
 
 int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 6449ac4de4883..c5d0accd6e057 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3663,3 +3663,30 @@ void kvm_mmu_uninit_shadow_mmu(struct kvm *kvm)
 	kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
 	kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
 }
+
+void kvm_shadow_mmu_walk_lockless_begin(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Prevent page table teardown by making any free-er wait during
+	 * kvm_flush_remote_tlbs() IPI to all active vcpus.
+	 */
+	local_irq_disable();
+
+	/*
+	 * Make sure a following spte read is not reordered ahead of the write
+	 * to vcpu->mode.
+	 */
+	smp_store_mb(vcpu->mode, READING_SHADOW_PAGE_TABLES);
+}
+
+void kvm_shadow_mmu_walk_lockless_end(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Make sure the write to vcpu->mode is not reordered in front
+	 * of reads to sptes.  If it does,
+	 * kvm_shadow_mmu_commit_zap_page() can see us
+	 * OUTSIDE_GUEST_MODE and proceed to free the shadow page table.
+	 */
+	smp_store_release(&vcpu->mode, OUTSIDE_GUEST_MODE);
+	local_irq_enable();
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index f2e54355ebb19..12835872bda34 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -103,6 +103,9 @@ void kvm_shadow_mmu_zap_all(struct kvm *kvm);
 void kvm_mmu_init_shadow_mmu(struct kvm *kvm);
 void kvm_mmu_uninit_shadow_mmu(struct kvm *kvm);
 
+void kvm_shadow_mmu_walk_lockless_begin(struct kvm_vcpu *vcpu);
+void kvm_shadow_mmu_walk_lockless_end(struct kvm_vcpu *vcpu);
+
 /* Exports from paging_tmpl.h */
 gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			  gpa_t vaddr, u64 access,
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
  2023-02-02 18:27 ` [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
@ 2023-02-04 13:52   ` kernel test robot
  2023-02-04 19:10   ` kernel test robot
  1 sibling, 0 replies; 29+ messages in thread
From: kernel test robot @ 2023-02-04 13:52 UTC (permalink / raw)
  To: Ben Gardon, linux-kernel, kvm
  Cc: oe-kbuild-all, Paolo Bonzini, Peter Xu, Sean Christopherson,
	David Matlack, Vipin Sharma, Ricardo Koller, Ben Gardon

Hi Ben,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f]

url:    https://github.com/intel-lab-lkp/linux/commits/Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
base:   7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f
patch link:    https://lore.kernel.org/r/20230202182809.1929122-12-bgardon%40google.com
patch subject: [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
config: x86_64-randconfig-a002 (https://download.01.org/0day-ci/archive/20230204/202302042127.gbKNDVaL-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/c1170de906dfe1ee64da0066e7c28d35e716ed18
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
        git checkout c1170de906dfe1ee64da0066e7c28d35e716ed18
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/kvm/mmu/mmu.c:3148:15: warning: no previous prototype for 'mmu_shrink_scan' [-Wmissing-prototypes]
    3148 | unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
         |               ^~~~~~~~~~~~~~~


vim +/mmu_shrink_scan +3148 arch/x86/kvm/mmu/mmu.c

  3147	
> 3148	unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
  3149	{
  3150		struct kvm *kvm;
  3151		int nr_to_scan = sc->nr_to_scan;
  3152		unsigned long freed = 0;
  3153	
  3154		mutex_lock(&kvm_lock);
  3155	
  3156		list_for_each_entry(kvm, &vm_list, vm_list) {
  3157			/*
  3158			 * Never scan more than sc->nr_to_scan VM instances.
  3159			 * Will not hit this condition practically since we do not try
  3160			 * to shrink more than one VM and it is very unlikely to see
  3161			 * !n_used_mmu_pages so many times.
  3162			 */
  3163			if (!nr_to_scan--)
  3164				break;
  3165	
  3166			/*
  3167			 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
  3168			 * here. We may skip a VM instance errorneosly, but we do not
  3169			 * want to shrink a VM that only started to populate its MMU
  3170			 * anyway.
  3171			 */
  3172			if (!kvm->arch.n_used_mmu_pages &&
  3173			    !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
  3174				continue;
  3175	
  3176			freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
  3177	
  3178			/*
  3179			 * unfair on small ones
  3180			 * per-vm shrinkers cry out
  3181			 * sadness comes quickly
  3182			 */
  3183			list_move_tail(&kvm->vm_list, &vm_list);
  3184			break;
  3185		}
  3186	
  3187		mutex_unlock(&kvm_lock);
  3188		return freed;
  3189	}
  3190	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
  2023-02-02 18:28 ` [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
@ 2023-02-04 14:44   ` kernel test robot
  0 siblings, 0 replies; 29+ messages in thread
From: kernel test robot @ 2023-02-04 14:44 UTC (permalink / raw)
  To: Ben Gardon, linux-kernel, kvm
  Cc: oe-kbuild-all, Paolo Bonzini, Peter Xu, Sean Christopherson,
	David Matlack, Vipin Sharma, Ricardo Koller, Ben Gardon

Hi Ben,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f]

url:    https://github.com/intel-lab-lkp/linux/commits/Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
base:   7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f
patch link:    https://lore.kernel.org/r/20230202182809.1929122-16-bgardon%40google.com
patch subject: [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
config: x86_64-randconfig-a002 (https://download.01.org/0day-ci/archive/20230204/202302042238.bWSFU5Qg-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/133035de32b4ef8bb8e3868c78dbdb3b807f04f4
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
        git checkout 133035de32b4ef8bb8e3868c78dbdb3b807f04f4
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/kvm/mmu/shadow_mmu.c:3208:6: warning: no previous prototype for 'slot_rmap_write_protect' [-Wmissing-prototypes]
    3208 | bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
         |      ^~~~~~~~~~~~~~~~~~~~~~~


vim +/slot_rmap_write_protect +3208 arch/x86/kvm/mmu/shadow_mmu.c

306e405e1b7fe70 Ben Gardon 2023-02-02  3207  
306e405e1b7fe70 Ben Gardon 2023-02-02 @3208  bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
306e405e1b7fe70 Ben Gardon 2023-02-02  3209  			     const struct kvm_memory_slot *slot)
306e405e1b7fe70 Ben Gardon 2023-02-02  3210  {
306e405e1b7fe70 Ben Gardon 2023-02-02  3211  	return rmap_write_protect(rmap_head, false);
306e405e1b7fe70 Ben Gardon 2023-02-02  3212  }
306e405e1b7fe70 Ben Gardon 2023-02-02  3213  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
  2023-02-02 18:27 ` [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
  2023-02-04 13:52   ` kernel test robot
@ 2023-02-04 19:10   ` kernel test robot
  1 sibling, 0 replies; 29+ messages in thread
From: kernel test robot @ 2023-02-04 19:10 UTC (permalink / raw)
  To: Ben Gardon, linux-kernel, kvm
  Cc: llvm, oe-kbuild-all, Paolo Bonzini, Peter Xu,
	Sean Christopherson, David Matlack, Vipin Sharma, Ricardo Koller,
	Ben Gardon

Hi Ben,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f]

url:    https://github.com/intel-lab-lkp/linux/commits/Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
base:   7cb79f433e75b05d1635aefaa851cfcd1cb7dc4f
patch link:    https://lore.kernel.org/r/20230202182809.1929122-12-bgardon%40google.com
patch subject: [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
config: x86_64-randconfig-a014 (https://download.01.org/0day-ci/archive/20230205/202302050303.kLjMucYK-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/c1170de906dfe1ee64da0066e7c28d35e716ed18
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ben-Gardon/KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-and-clean-up-code/20230203-023259
        git checkout c1170de906dfe1ee64da0066e7c28d35e716ed18
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/kvm/mmu/mmu.c:3148:15: warning: no previous prototype for function 'mmu_shrink_scan' [-Wmissing-prototypes]
   unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
                 ^
   arch/x86/kvm/mmu/mmu.c:3148:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
   ^
   static 
   1 warning generated.


vim +/mmu_shrink_scan +3148 arch/x86/kvm/mmu/mmu.c

  3147	
> 3148	unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
  3149	{
  3150		struct kvm *kvm;
  3151		int nr_to_scan = sc->nr_to_scan;
  3152		unsigned long freed = 0;
  3153	
  3154		mutex_lock(&kvm_lock);
  3155	
  3156		list_for_each_entry(kvm, &vm_list, vm_list) {
  3157			/*
  3158			 * Never scan more than sc->nr_to_scan VM instances.
  3159			 * Will not hit this condition practically since we do not try
  3160			 * to shrink more than one VM and it is very unlikely to see
  3161			 * !n_used_mmu_pages so many times.
  3162			 */
  3163			if (!nr_to_scan--)
  3164				break;
  3165	
  3166			/*
  3167			 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
  3168			 * here. We may skip a VM instance errorneosly, but we do not
  3169			 * want to shrink a VM that only started to populate its MMU
  3170			 * anyway.
  3171			 */
  3172			if (!kvm->arch.n_used_mmu_pages &&
  3173			    !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
  3174				continue;
  3175	
  3176			freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
  3177	
  3178			/*
  3179			 * unfair on small ones
  3180			 * per-vm shrinkers cry out
  3181			 * sadness comes quickly
  3182			 */
  3183			list_move_tail(&kvm->vm_list, &vm_list);
  3184			break;
  3185		}
  3186	
  3187		mutex_unlock(&kvm_lock);
  3188		return freed;
  3189	}
  3190	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
  2023-02-02 18:27 ` [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
@ 2023-03-20 18:41   ` Sean Christopherson
  2023-03-21 18:43     ` Ben Gardon
  0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2023-03-20 18:41 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Vipin Sharma, Ricardo Koller

First off, I apologize for not giving this feedback in the RFC.  I didn't think
too hard about the implications of moving paging_tmpl.h until I actually looked
at the code.

On Thu, Feb 02, 2023, Ben Gardon wrote:
> Move the integration point for paging_tmpl.h to shadow_mmu.c since
> paging_tmpl.h is ostensibly part of the Shadow MMU.

Ostensibly indeed.  While a simple majority of paging_tmpl.h is indeed unique to
the shadow MMU, all of the guest walker code needs to exist independent of the
shadow MMU.  And that code is signficant both in terms of lines of code, and
more importantly in terms of understanding its role in KVM at large.

This is essentially the same mess that eventually led to the cpu_role vs. root_role
cleanup, and I think we should figure out a way to give paging_tmpl.h similar
treatment.  E.g. split paging_tmpl.h itself in some way.

Unfortunately, this is a sticking point for me.  If the code movement were minor
and/or cleaner in nature (definitely not your fault, simply the reality of the
code base), I might feel differently.  But as it stands, there is a lot of churn
to get to an endpoint that has significant flaws.

So while I love the idea of separating the MMU implementations from the common
MMU logic, because the guest walker stuff is a lynchpin of sorts, e.g. splitting
out the guest walker logic could go hand-in-hand with reworking guest_mmu, I don't
want to take this series as is.

Sadly, as much as I'm itching to dive in and do a bit of exploration, I am woefully
short on bandwidth right now, so all I can do is say no.  Sorry :-(

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36()
  2023-02-02 18:27 ` [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36() Ben Gardon
@ 2023-03-20 18:51   ` Sean Christopherson
  0 siblings, 0 replies; 29+ messages in thread
From: Sean Christopherson @ 2023-03-20 18:51 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Vipin Sharma, Ricardo Koller

On Thu, Feb 02, 2023, Ben Gardon wrote:
> is_cpuid_PSE36() always returns 1 and is never overridden, so just get
> rid of the function. This saves having to export it in a future commit
> in order to move the include of paging_tmpl.h out of mmu.c.

Probably won't matter as I suspect this series is going to end up on a burner way
in the back, but FWIW I'd prefer to preserve is_cpuid_PSE36() in some capacity.
I 100% agree the helper is silly, but the mere existence of the flag is so esoteric
these days that I like having obvious/obnoxious code to call it out.
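
For readers following along, the changelog quoted above says the helper
always returns 1, i.e. it is roughly a stub along these lines (a sketch,
not a verbatim copy of mmu.c):

	static int is_cpuid_PSE36(void)
	{
		/* Hardwired to 1; never overridden (see the changelog above). */
		return 1;
	}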

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (19 preceding siblings ...)
  2023-02-02 18:28 ` [PATCH 21/21] KVM: x86/mmu: Split out Shadow MMU lockless walk begin/end Ben Gardon
@ 2023-03-20 19:09 ` Sean Christopherson
  2023-03-23 23:02 ` Sean Christopherson
  21 siblings, 0 replies; 29+ messages in thread
From: Sean Christopherson @ 2023-03-20 19:09 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Vipin Sharma, Ricardo Koller

On Thu, Feb 02, 2023, Ben Gardon wrote:
> Patches 4-6 prepare for the refactor by adding files and exporting
> functions.

For future reference, please do not conflate "export" with "make globally visible"
(here and in many of the changelogs).  The distinction matters, especially for
modules, as an exported symbol is quite different than a globally visible symbol.

We (sadly) lose sight of this in KVM far too often due to kvm.ko exporting an
absurd number of symbols for kvm-{amd,intel}.ko, and as a result we've ended up
with non-KVM code using helpers that really should be KVM-only.  This is something I
hope to remedy in the near-ish future, and so I want us to start getting the
terminology right.
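
To make the distinction concrete, a minimal sketch (the function name is
hypothetical, not taken from this series):

	/*
	 * Globally visible: non-static and declared in a header, but only
	 * linkable from within the same image (here, kvm.ko).
	 */
	void kvm_shadow_mmu_do_stuff(struct kvm *kvm);

	/*
	 * Exported: additionally resolvable by modules, e.g. kvm-amd.ko and
	 * kvm-intel.ko.
	 */
	EXPORT_SYMBOL_GPL(kvm_shadow_mmu_do_stuff);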

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
  2023-03-20 18:41   ` Sean Christopherson
@ 2023-03-21 18:43     ` Ben Gardon
  0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-03-21 18:43 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
	Vipin Sharma, Ricardo Koller

On Mon, Mar 20, 2023 at 11:41 AM Sean Christopherson <seanjc@google.com> wrote:
>
> First off, I apologize for not giving this feedback in the RFC.  I didn't think
> too hard about the impliciations of moving paging_tmpl.h until I actually looked
> at the code.
>
> On Thu, Feb 02, 2023, Ben Gardon wrote:
> > Move the integration point for paging_tmpl.h to shadow_mmu.c since
> > paging_tmpl.h is ostensibly part of the Shadow MMU.
>
> Ostensibly indeed.  While a simple majority of paging_tmpl.h is indeed unique to
> the shadow MMU, all of the guest walker code needs to exist independent of the
> shadow MMU.  And that code is signficant both in terms of lines of code, and
> more importantly in terms of understanding its role in KVM at large.
>
> This is essentially the same mess that eventually led the cpu_role vs. root_role
> cleanup, and I think we should figure out a way to give paging_tmpl.h similar
> treatment.  E.g. split paging_tmpl.h itself in some way.
>
> Unfortunately, this is a sticking point for me.  If the code movement were minor
> and/or cleaner in nature (definitely not your fault, simply the reality of the
> code base), I might feel differently.  But as it stands, there is a lot of churn
> to get to an endpoint that has significant flaws.
>
> So while I love the idea of separating the MMU implementations from the common
> MMU logic, because the guest walker stuff is a lynchpin of sorts, e.g. splitting
> out the guest walker logic could go hand-in-hand with reworking guest_mmu, I don't
> want to take this series as is.
>
> Sadly, as much as I'm itching to dive in and do a bit of exploration, I am woefully
> short on bandwidth right now, so all I can do is say no.  Sorry :-(

Fair enough, thanks for taking a look. I'm not going to have bandwidth
in the foreseeable future to work on this any more either,
unfortunately. I'd love it if someone picked up this series and did
the paging_tmpl.h split, but that's going to be a lot of work, so in
the meantime, I don't mind just letting this die.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU
  2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
                   ` (20 preceding siblings ...)
  2023-03-20 19:09 ` [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Sean Christopherson
@ 2023-03-23 23:02 ` Sean Christopherson
  21 siblings, 0 replies; 29+ messages in thread
From: Sean Christopherson @ 2023-03-23 23:02 UTC (permalink / raw)
  To: Sean Christopherson, linux-kernel, kvm, Ben Gardon
  Cc: Paolo Bonzini, Peter Xu, David Matlack, Vipin Sharma, Ricardo Koller

On Thu, 02 Feb 2023 18:27:48 +0000, Ben Gardon wrote:
> This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
> implemented in separate files, with a defined interface to common code.
> 
> When the TDP (Two Dimensional Paging) MMU was added to x86 KVM, it came in
> a separate file with a (reasonably) clear interface. This lead to many
> points in the KVM MMU like this:
> 
> [...]

Applied the first three to kvm-x86 mmu, which just makes me look like a jerk
since they're all my patches.  :-(

[01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code
        https://github.com/kvm-x86/linux/commit/727ae3770132
[02/21] KVM: x86/mmu: Replace comment with an actual lockdep assertion on mmu_lock
        https://github.com/kvm-x86/linux/commit/eddd9e8302de
[03/21] KVM: x86/mmu: Clean up mmu.c functions that put return type on separate line
        https://github.com/kvm-x86/linux/commit/f3d90f901d18

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-03-23 23:03 UTC | newest]

Thread overview: 29+ messages
2023-02-02 18:27 [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
2023-02-02 18:27 ` [PATCH 01/21] KVM: x86/mmu: Rename slot rmap walkers to add clarity and clean up code Ben Gardon
2023-02-02 18:27 ` [PATCH 02/21] KVM: x86/mmu: Replace comment with an actual lockdep assertion on mmu_lock Ben Gardon
2023-02-02 18:27 ` [PATCH 03/21] KVM: x86/mmu: Clean up mmu.c functions that put return type on separate line Ben Gardon
2023-02-02 18:27 ` [PATCH 04/21] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
2023-02-02 18:27 ` [PATCH 05/21] KVM: x86/MMU: Expose functions for the Shadow MMU Ben Gardon
2023-02-02 18:27 ` [PATCH 06/21] KVM: x86/mmu: Get rid of is_cpuid_PSE36() Ben Gardon
2023-03-20 18:51   ` Sean Christopherson
2023-02-02 18:27 ` [PATCH 08/21] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
2023-02-02 18:27 ` [PATCH 09/21] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
2023-03-20 18:41   ` Sean Christopherson
2023-03-21 18:43     ` Ben Gardon
2023-02-02 18:27 ` [PATCH 10/21] KVM: x86/MMU: Clean up Shadow MMU exports Ben Gardon
2023-02-02 18:27 ` [PATCH 11/21] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
2023-02-04 13:52   ` kernel test robot
2023-02-04 19:10   ` kernel test robot
2023-02-02 18:28 ` [PATCH 12/21] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions Ben Gardon
2023-02-02 18:28 ` [PATCH 13/21] KVM: x86/MMU: Fix naming on prepare / commit zap page functions Ben Gardon
2023-02-02 18:28 ` [PATCH 14/21] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c Ben Gardon
2023-02-02 18:28 ` [PATCH 15/21] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
2023-02-04 14:44   ` kernel test robot
2023-02-02 18:28 ` [PATCH 16/21] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
2023-02-02 18:28 ` [PATCH 17/21] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h Ben Gardon
2023-02-02 18:28 ` [PATCH 18/21] KVM: x86/mmu: Move split cache topup functions to shadow_mmu.c Ben Gardon
2023-02-02 18:28 ` [PATCH 19/21] KVM: x86/mmu: Move Shadow MMU part of kvm_mmu_zap_all() to shadow_mmu.h Ben Gardon
2023-02-02 18:28 ` [PATCH 20/21] KVM: x86/mmu: Move Shadow MMU init/teardown to shadow_mmu.c Ben Gardon
2023-02-02 18:28 ` [PATCH 21/21] KVM: x86/mmu: Split out Shadow MMU lockless walk begin/end Ben Gardon
2023-03-20 19:09 ` [PATCH 00/21] KVM: x86/MMU: Formalize the Shadow MMU Sean Christopherson
2023-03-23 23:02 ` Sean Christopherson
