linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-09-07 17:43   ` David Matlack
  2022-09-18 13:11   ` Robert Hoo
  2022-08-24  9:29 ` [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp() Hou Wenlong
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

The spte pointing to the child SP is dropped, so the
whole gfn range covered by the child SP should be flushed.
Although Hyper-V may treat a 1-page flush the same way
when the address points to a huge page, it is still better
to use the correct huge page size. Also introduce
a helper function to do range-based flushing when a direct
SP is dropped, which helps prevent future buggy uses
of kvm_flush_remote_tlbs_with_address() in such cases.

Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e418ef3ecfcb..a3578abd8bbc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 	kvm_flush_remote_tlbs_with_range(kvm, &range);
 }
 
+/* Flush all memory mapped by the given direct SP. */
+static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+	WARN_ON_ONCE(!sp->role.direct);
+	kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
+					   KVM_PAGES_PER_HPAGE(sp->role.level + 1));
+}
+
 static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
 			   unsigned int access)
 {
@@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 			return;
 
 		drop_parent_pte(child, sptep);
-		kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
+		kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
 	}
 }
 
-- 
2.31.1
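
As background for the KVM_PAGES_PER_HPAGE(sp->role.level + 1) computation
above: an SP at level L holds 512 SPTEs, and each SPTE at level L maps
KVM_PAGES_PER_HPAGE(L) 4K pages, so a whole SP covers exactly as much
memory as one huge page at the next level up. A minimal standalone sketch
of that arithmetic (the helper names are illustrative, not existing kernel
functions; the 9-bit-per-level shift mirrors x86's KVM_HPAGE_GFN_SHIFT):

	/* 4K pages mapped by one PTE at 'level' (level 1 == 4K PTE). */
	static unsigned long pages_per_pte(int level)
	{
		return 1UL << ((level - 1) * 9);
	}

	/* 4K pages mapped by a whole 512-entry SP at 'level'. */
	static unsigned long pages_per_sp(int level)
	{
		/* 512 * pages_per_pte(level) == pages_per_pte(level + 1) */
		return pages_per_pte(level + 1);
	}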



* [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp()
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
  2022-08-24  9:29 ` [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte() Hou Wenlong
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-09-07 17:50   ` David Matlack
  2022-08-24  9:29 ` [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level() Hou Wenlong
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

When the spte of a huge page is dropped in kvm_set_pte_rmap(),
the whole gfn range covered by the spte should be flushed.
However, rmap_walk_init_level() doesn't align the gfn down
for the new level the way the tdp iterator does, so the gfn
used in kvm_set_pte_rmap() is not the base gfn of the huge
page, and the size of the gfn range is wrong as well. Use
the base gfn and the size of the huge page when flushing
tlbs for a huge page.

Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a3578abd8bbc..3bcff56df109 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1438,7 +1438,8 @@ static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	}
 
 	if (need_flush && kvm_available_flush_tlb_with_range()) {
-		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
+		kvm_flush_remote_tlbs_with_address(kvm, gfn & -KVM_PAGES_PER_HPAGE(level),
+						   KVM_PAGES_PER_HPAGE(level));
 		return false;
 	}
 
-- 
2.31.1
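
The base-gfn computation above relies on KVM_PAGES_PER_HPAGE(level) being
a power of two, so negating it yields a mask that clears the low index
bits. A worked example of the expression (values assume level == 2, i.e.
a 2M page):

	gfn_t base = gfn & -KVM_PAGES_PER_HPAGE(level);
	/*
	 * KVM_PAGES_PER_HPAGE(2) == 512 and -512 == ~511UL, so e.g.
	 * gfn 0x12345 & ~0x1ff == 0x12200, the 2M-aligned base gfn.
	 */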



* [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level()
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
  2022-08-24  9:29 ` [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte() Hou Wenlong
  2022-08-24  9:29 ` [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp() Hou Wenlong
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-09-07 17:58   ` David Matlack
  2022-08-24  9:29 ` [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range Hou Wenlong
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Since the child SP is zapped, the gfn range of the tlb flush should be
the range covered by the child SP, not the parent SP. Replace sp->gfn,
which is the base gfn of the parent SP, with iter->gfn, and use the
correct gfn range size for the child SP to reduce the tlb flushing range.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bf2ccf9debca..08b7932122ec 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1071,8 +1071,8 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 		return RET_PF_RETRY;
 	else if (is_shadow_present_pte(iter->old_spte) &&
 		 !is_last_spte(iter->old_spte, iter->level))
-		kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn,
-						   KVM_PAGES_PER_HPAGE(iter->level + 1));
+		kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter->gfn,
+						   KVM_PAGES_PER_HPAGE(iter->level));
 
 	/*
 	 * If the page fault was caused by a write but the page is write
-- 
2.31.1
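
To make the reduction concrete, here is the before/after flush range for
one assumed case, a zapped child SP that covered a 2M region (iter->level
== 2):

	/*
	 * old: start = sp->gfn (1G-aligned base of the parent SP),
	 *      pages = KVM_PAGES_PER_HPAGE(3) == 262144 (1G)
	 * new: start = iter->gfn (2M-aligned gfn of the zapped child),
	 *      pages = KVM_PAGES_PER_HPAGE(2) == 512 (2M)
	 */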



* [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
                   ` (2 preceding siblings ...)
  2022-08-24  9:29 ` [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level() Hou Wenlong
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-09-07 18:25   ` David Matlack
  2022-08-24  9:29 ` [PATCH v2 5/6] KVM: x86/mmu: Introduce helper function to do range-based flushing for given page Hou Wenlong
  2022-08-24  9:29 ` [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)() Hou Wenlong
  5 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

When a spte is dropped, the start gfn of the tlb flush should
be the gfn of the spte, not the base gfn of the SP which
contains the spte. Also introduce a helper function to do
range-based flushing when a spte is dropped, which helps
prevent future buggy uses of kvm_flush_remote_tlbs_with_address()
in such cases.

Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/mmu.c         | 20 +++++++++++++++-----
 arch/x86/kvm/mmu/paging_tmpl.h |  3 +--
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3bcff56df109..e0b9432b9491 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -260,6 +260,18 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 	kvm_flush_remote_tlbs_with_range(kvm, &range);
 }
 
+static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index);
+
+/* Flush the range of guest memory mapped by the given SPTE. */
+static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
+{
+	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
+	gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));
+
+	kvm_flush_remote_tlbs_with_address(kvm, gfn,
+					   KVM_PAGES_PER_HPAGE(sp->role.level));
+}
+
 /* Flush all memory mapped by the given direct SP. */
 static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
@@ -1156,8 +1168,7 @@ static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush)
 	drop_spte(kvm, sptep);
 
 	if (flush)
-		kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
-			KVM_PAGES_PER_HPAGE(sp->role.level));
+		kvm_flush_remote_tlbs_sptep(kvm, sptep);
 }
 
 /*
@@ -1608,7 +1619,7 @@ static void __rmap_add(struct kvm *kvm,
 	if (rmap_count > RMAP_RECYCLE_THRESHOLD) {
 		kvm_zap_all_rmap_sptes(kvm, rmap_head);
 		kvm_flush_remote_tlbs_with_address(
-				kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
+				kvm, gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
 	}
 }
 
@@ -6402,8 +6413,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);
 
 			if (kvm_available_flush_tlb_with_range())
-				kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
-					KVM_PAGES_PER_HPAGE(sp->role.level));
+				kvm_flush_remote_tlbs_sptep(kvm, sptep);
 			else
 				need_tlb_flush = 1;
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 39e0205e7300..04149c704d5b 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -937,8 +937,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
 
 			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
 			if (is_shadow_present_pte(old_spte))
-				kvm_flush_remote_tlbs_with_address(vcpu->kvm,
-					sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
+				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
 
 			if (!rmap_can_add(vcpu))
 				break;
-- 
2.31.1
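
For reference, the gfn recovery in the new helper works by pointer
arithmetic: the SPTEs of one SP live in a single 512-entry page, so the
index of a SPTE within its SP falls out of its address. Roughly
(mirroring spte_index() in arch/x86/kvm/mmu/spte.h; exact details may
differ by kernel version):

	/* Index of sptep within the 512-entry page table of its SP. */
	static inline int spte_index(u64 *sptep)
	{
		return ((unsigned long)sptep / sizeof(*sptep)) & (512 - 1);
	}

kvm_mmu_page_get_gfn(sp, index) then maps that index back to the gfn
translated by the SPTE; for a direct SP this is roughly
sp->gfn + (index << ((sp->role.level - 1) * 9)).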



* [PATCH v2 5/6] KVM: x86/mmu: Introduce helper function to do range-based flushing for given page
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
                   ` (3 preceding siblings ...)
  2022-08-24  9:29 ` [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range Hou Wenlong
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-08-24  9:29 ` [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)() Hou Wenlong
  5 siblings, 0 replies; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Flushing the tlb for one page (huge or not) is the main use case, so
introduce a helper function for this common operation to make the
code clearer.

Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/mmu.c          | 16 ++++++----------
 arch/x86/kvm/mmu/mmu_internal.h | 10 ++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c      |  6 ++----
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e0b9432b9491..92ca76e11d96 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -268,16 +268,14 @@ static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
 	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
 	gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));

-	kvm_flush_remote_tlbs_with_address(kvm, gfn,
-					   KVM_PAGES_PER_HPAGE(sp->role.level));
+	kvm_flush_remote_tlbs_gfn(kvm, gfn, sp->role.level);
 }

 /* Flush all memory mapped by the given direct SP. */
 static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	WARN_ON_ONCE(!sp->role.direct);
-	kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
-					   KVM_PAGES_PER_HPAGE(sp->role.level + 1));
+	kvm_flush_remote_tlbs_gfn(kvm, sp->gfn, sp->role.level + 1);
 }

 static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
@@ -1449,8 +1447,8 @@ static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	}

 	if (need_flush && kvm_available_flush_tlb_with_range()) {
-		kvm_flush_remote_tlbs_with_address(kvm, gfn & -KVM_PAGES_PER_HPAGE(level),
-						   KVM_PAGES_PER_HPAGE(level));
+		kvm_flush_remote_tlbs_gfn(kvm, gfn & -KVM_PAGES_PER_HPAGE(level),
+					  level);
 		return false;
 	}

@@ -1618,8 +1616,7 @@ static void __rmap_add(struct kvm *kvm,

 	if (rmap_count > RMAP_RECYCLE_THRESHOLD) {
 		kvm_zap_all_rmap_sptes(kvm, rmap_head);
-		kvm_flush_remote_tlbs_with_address(
-				kvm, gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
+		kvm_flush_remote_tlbs_gfn(kvm, gfn, sp->role.level);
 	}
 }

@@ -2844,8 +2841,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 	}

 	if (flush)
-		kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn,
-				KVM_PAGES_PER_HPAGE(level));
+		kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level);

 	pgprintk("%s: setting spte %llx\n", __func__, *sptep);

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..6651c154f2e0 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -163,8 +163,18 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 				    struct kvm_memory_slot *slot, u64 gfn,
 				    int min_level);
+
 void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 					u64 start_gfn, u64 pages);
+
+/* Flush the given page (huge or not) of guest memory. */
+static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
+{
+	u64 pages = KVM_PAGES_PER_HPAGE(level);
+
+	kvm_flush_remote_tlbs_with_address(kvm, gfn, pages);
+}
+
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);

 extern int nx_huge_pages;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 08b7932122ec..567691440ab0 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -673,8 +673,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 	if (ret)
 		return ret;

-	kvm_flush_remote_tlbs_with_address(kvm, iter->gfn,
-					   KVM_PAGES_PER_HPAGE(iter->level));
+	kvm_flush_remote_tlbs_gfn(kvm, iter->gfn, iter->level);

 	/*
 	 * No other thread can overwrite the removed SPTE as they must either
@@ -1071,8 +1070,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 		return RET_PF_RETRY;
 	else if (is_shadow_present_pte(iter->old_spte) &&
 		 !is_last_spte(iter->old_spte, iter->level))
-		kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter->gfn,
-						   KVM_PAGES_PER_HPAGE(iter->level));
+		kvm_flush_remote_tlbs_gfn(vcpu->kvm, iter->gfn, iter->level);

 	/*
 	 * If the page fault was caused by a write but the page is write
--
2.31.1
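
For reference, the page count that the new helper derives from 'level'
comes from the x86 definitions below (as found in
arch/x86/include/asm/kvm_host.h around the time of this series):

	#define KVM_HPAGE_GFN_SHIFT(x)	(((x) - 1) * 9)
	#define KVM_HPAGE_SHIFT(x)	(PAGE_SHIFT + KVM_HPAGE_GFN_SHIFT(x))
	#define KVM_HPAGE_SIZE(x)	(1UL << KVM_HPAGE_SHIFT(x))
	#define KVM_PAGES_PER_HPAGE(x)	(KVM_HPAGE_SIZE(x) / PAGE_SIZE)

So KVM_PAGES_PER_HPAGE(1) == 1, KVM_PAGES_PER_HPAGE(2) == 512, and
KVM_PAGES_PER_HPAGE(3) == 262144.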



* [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)()
       [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
                   ` (4 preceding siblings ...)
  2022-08-24  9:29 ` [PATCH v2 5/6] KVM: x86/mmu: Introduce helper function to do range-based flushing for given page Hou Wenlong
@ 2022-08-24  9:29 ` Hou Wenlong
  2022-09-07 17:40   ` David Matlack
  5 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-08-24  9:29 UTC (permalink / raw)
  To: kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Only an SP at the PG_LEVEL_4K level can be unsync, so the size of the
gfn range must be 1.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 04149c704d5b..486a3163b1e4 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -937,7 +937,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
 
 			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
 			if (is_shadow_present_pte(old_spte))
-				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
+				kvm_flush_remote_tlbs_gfn(vcpu->kvm,
+					kvm_mmu_page_get_gfn(sp, sptep - sp->spt), 1);
 
 			if (!rmap_can_add(vcpu))
 				break;
-- 
2.31.1



* Re: [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)()
  2022-08-24  9:29 ` [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)() Hou Wenlong
@ 2022-09-07 17:40   ` David Matlack
  2022-09-13 12:58     ` Hou Wenlong
  0 siblings, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-07 17:40 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel

On Wed, Aug 24, 2022 at 05:29:23PM +0800, Hou Wenlong wrote:
> Only an SP at the PG_LEVEL_4K level can be unsync, so the size of the
> gfn range must be 1.
> 
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 04149c704d5b..486a3163b1e4 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -937,7 +937,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
>  
>  			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
>  			if (is_shadow_present_pte(old_spte))
> -				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
> +				kvm_flush_remote_tlbs_gfn(vcpu->kvm,
> +					kvm_mmu_page_get_gfn(sp, sptep - sp->spt), 1);

The third argument to kvm_flush_remote_tlbs_gfn() is the level, not the
number of pages. But that aside, I don't understand why this patch is
necessary. kvm_flush_remote_tlbs_sptep() should already do the right
thing.

>  
>  			if (!rmap_can_add(vcpu))
>  				break;
> -- 
> 2.31.1
> 


* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-08-24  9:29 ` [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte() Hou Wenlong
@ 2022-09-07 17:43   ` David Matlack
  2022-09-13 12:07     ` Hou Wenlong
  2022-09-18 13:11   ` Robert Hoo
  1 sibling, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-07 17:43 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Lan Tianyu, linux-kernel

On Wed, Aug 24, 2022 at 05:29:18PM +0800, Hou Wenlong wrote:
> The spte pointing to the child SP is dropped, so the
> whole gfn range covered by the child SP should be flushed.
> Although Hyper-V may treat a 1-page flush the same way
> when the address points to a huge page, it is still better
> to use the correct huge page size. Also introduce
> a helper function to do range-based flushing when a direct
> SP is dropped, which helps prevent future buggy uses
> of kvm_flush_remote_tlbs_with_address() in such cases.
> 
> Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> Suggested-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e418ef3ecfcb..a3578abd8bbc 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
>  	kvm_flush_remote_tlbs_with_range(kvm, &range);
>  }
>  
> +/* Flush all memory mapped by the given direct SP. */
> +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> +{
> +	WARN_ON_ONCE(!sp->role.direct);
> +	kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> +					   KVM_PAGES_PER_HPAGE(sp->role.level + 1));

nit: I think it would make sense to introduce
kvm_flush_remote_tlbs_gfn() in this patch since you are going to
eventually use it here anyway.

> +}
> +
>  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
>  			   unsigned int access)
>  {
> @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  			return;
>  
>  		drop_parent_pte(child, sptep);
> -		kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> +		kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
>  	}
>  }
>  
> -- 
> 2.31.1
> 


* Re: [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp()
  2022-08-24  9:29 ` [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp() Hou Wenlong
@ 2022-09-07 17:50   ` David Matlack
  0 siblings, 0 replies; 24+ messages in thread
From: David Matlack @ 2022-09-07 17:50 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Lan Tianyu, linux-kernel

On Wed, Aug 24, 2022 at 05:29:19PM +0800, Hou Wenlong wrote:
> When the spte of a huge page is dropped in kvm_set_pte_rmap(),
> the whole gfn range covered by the spte should be flushed.
> However, rmap_walk_init_level() doesn't align the gfn down
> for the new level the way the tdp iterator does, so the gfn
> used in kvm_set_pte_rmap() is not the base gfn of the huge
> page, and the size of the gfn range is wrong as well. Use
> the base gfn and the size of the huge page when flushing
> tlbs for a huge page.
> 
> Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index a3578abd8bbc..3bcff56df109 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1438,7 +1438,8 @@ static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
>  	}
>  
>  	if (need_flush && kvm_available_flush_tlb_with_range()) {
> -		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
> +		kvm_flush_remote_tlbs_with_address(kvm, gfn & -KVM_PAGES_PER_HPAGE(level),

Rounding down the GFN to a huge page size is a common pattern throughout
KVM. Can you introduce a common way of doing this and clean up the other
call sites?
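
A minimal sketch of such a helper (the name and placement are just a
suggestion, not an existing KVM function):

	static inline gfn_t gfn_round_for_level(gfn_t gfn, int level)
	{
		return gfn & -KVM_PAGES_PER_HPAGE(level);
	}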

> +						   KVM_PAGES_PER_HPAGE(level));

This eventually gets converted to kvm_flush_remote_tlbs_gfn() in a later
patch, which is even more reason to introduce
kvm_flush_remote_tlbs_gfn() in the previous patch.

>  		return false;
>  	}
>  
> -- 
> 2.31.1
> 


* Re: [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level()
  2022-08-24  9:29 ` [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level() Hou Wenlong
@ 2022-09-07 17:58   ` David Matlack
  0 siblings, 0 replies; 24+ messages in thread
From: David Matlack @ 2022-09-07 17:58 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel

On Wed, Aug 24, 2022 at 05:29:20PM +0800, Hou Wenlong wrote:
> Since the child SP is zapped, the gfn range of the tlb flush should be
> the range covered by the child SP, not the parent SP. Replace sp->gfn,
> which is the base gfn of the parent SP, with iter->gfn, and use the
> correct gfn range size for the child SP to reduce the tlb flushing range.
> 

Fixes: bb95dfb9e2df ("KVM: x86/mmu: Defer TLB flush to caller when freeing TDP MMU shadow pages")

> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index bf2ccf9debca..08b7932122ec 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1071,8 +1071,8 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
>  		return RET_PF_RETRY;
>  	else if (is_shadow_present_pte(iter->old_spte) &&
>  		 !is_last_spte(iter->old_spte, iter->level))
> -		kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn,
> -						   KVM_PAGES_PER_HPAGE(iter->level + 1));
> +		kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter->gfn,
> +						   KVM_PAGES_PER_HPAGE(iter->level));
>  
>  	/*
>  	 * If the page fault was caused by a write but the page is write
> -- 
> 2.31.1
> 


* Re: [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range
  2022-08-24  9:29 ` [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range Hou Wenlong
@ 2022-09-07 18:25   ` David Matlack
  2022-09-13 12:50     ` Hou Wenlong
  0 siblings, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-07 18:25 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Lan Tianyu, linux-kernel

On Wed, Aug 24, 2022 at 05:29:21PM +0800, Hou Wenlong wrote:
> When a spte is dropped, the start gfn of the tlb flush should
> be the gfn of the spte, not the base gfn of the SP which
> contains the spte. Also introduce a helper function to do
> range-based flushing when a spte is dropped, which helps
> prevent future buggy uses of kvm_flush_remote_tlbs_with_address()
> in such cases.
> 
> Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> Suggested-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c         | 20 +++++++++++++++-----
>  arch/x86/kvm/mmu/paging_tmpl.h |  3 +--
>  2 files changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3bcff56df109..e0b9432b9491 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -260,6 +260,18 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
>  	kvm_flush_remote_tlbs_with_range(kvm, &range);
>  }
>  
> +static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index);
> +
> +/* Flush the range of guest memory mapped by the given SPTE. */
> +static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
> +{
> +	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
> +	gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));
> +
> +	kvm_flush_remote_tlbs_with_address(kvm, gfn,
> +					   KVM_PAGES_PER_HPAGE(sp->role.level));

How is the range-based TLB flushing supposed to work with indirect MMUs?
When KVM is using shadow paging, the gfn here is not part of the actual
translation.

For example, when TDP is disabled, KVM's shadow page tables translate
GVA to HPA. When Nested Virtualization is in use and running L2, KVM's
shadow page tables translate nGPA to HPA.

Ah, I see x86_ops.tlb_remote_flush_with_range is only set when running
on Hyper-V and TDP is enabled (VMX checks enable_ept and SVM checks
npt_enabled). But it looks like the nested case might still be broken?

> +}
> +
>  /* Flush all memory mapped by the given direct SP. */
>  static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
> @@ -1156,8 +1168,7 @@ static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush)
>  	drop_spte(kvm, sptep);
>  
>  	if (flush)
> -		kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> -			KVM_PAGES_PER_HPAGE(sp->role.level));
> +		kvm_flush_remote_tlbs_sptep(kvm, sptep);
>  }
>  
>  /*
> @@ -1608,7 +1619,7 @@ static void __rmap_add(struct kvm *kvm,
>  	if (rmap_count > RMAP_RECYCLE_THRESHOLD) {
>  		kvm_zap_all_rmap_sptes(kvm, rmap_head);
>  		kvm_flush_remote_tlbs_with_address(
> -				kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
> +				kvm, gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
>  	}
>  }
>  
> @@ -6402,8 +6413,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>  			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);
>  
>  			if (kvm_available_flush_tlb_with_range())
> -				kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> -					KVM_PAGES_PER_HPAGE(sp->role.level));
> +				kvm_flush_remote_tlbs_sptep(kvm, sptep);
>  			else
>  				need_tlb_flush = 1;
>  
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 39e0205e7300..04149c704d5b 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -937,8 +937,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
>  
>  			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
>  			if (is_shadow_present_pte(old_spte))
> -				kvm_flush_remote_tlbs_with_address(vcpu->kvm,
> -					sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
> +				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
>  
>  			if (!rmap_can_add(vcpu))
>  				break;
> -- 
> 2.31.1
> 


* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-07 17:43   ` David Matlack
@ 2022-09-13 12:07     ` Hou Wenlong
  2022-09-15 11:47       ` Liam Ni
  0 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-09-13 12:07 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Lan Tianyu, linux-kernel

On Thu, Sep 08, 2022 at 01:43:54AM +0800, David Matlack wrote:
> On Wed, Aug 24, 2022 at 05:29:18PM +0800, Hou Wenlong wrote:
> > The spte pointing to the child SP is dropped, so the
> > whole gfn range covered by the child SP should be flushed.
> > Although Hyper-V may treat a 1-page flush the same way
> > when the address points to a huge page, it is still better
> > to use the correct huge page size. Also introduce
> > a helper function to do range-based flushing when a direct
> > SP is dropped, which helps prevent future buggy uses
> > of kvm_flush_remote_tlbs_with_address() in such cases.
> > 
> > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > Suggested-by: David Matlack <dmatlack@google.com>
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e418ef3ecfcb..a3578abd8bbc 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> >  	kvm_flush_remote_tlbs_with_range(kvm, &range);
> >  }
> >  
> > +/* Flush all memory mapped by the given direct SP. */
> > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > +{
> > +	WARN_ON_ONCE(!sp->role.direct);
> > +	kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > +					   KVM_PAGES_PER_HPAGE(sp->role.level + 1));
> 
> nit: I think it would make sense to introduce
> kvm_flush_remote_tlbs_gfn() in this patch since you are going to
> eventually use it here anyway.
>
OK, I'll do it in the next version. Thanks!
 
> > +}
> > +
> >  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
> >  			   unsigned int access)
> >  {
> > @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> >  			return;
> >  
> >  		drop_parent_pte(child, sptep);
> > -		kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> > +		kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
> >  	}
> >  }
> >  
> > -- 
> > 2.31.1
> > 


* Re: [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range
  2022-09-07 18:25   ` David Matlack
@ 2022-09-13 12:50     ` Hou Wenlong
  0 siblings, 0 replies; 24+ messages in thread
From: Hou Wenlong @ 2022-09-13 12:50 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Lan Tianyu, linux-kernel

On Thu, Sep 08, 2022 at 02:25:42AM +0800, David Matlack wrote:
> On Wed, Aug 24, 2022 at 05:29:21PM +0800, Hou Wenlong wrote:
> > When a spte is dropped, the start gfn of the tlb flush should
> > be the gfn of the spte, not the base gfn of the SP which
> > contains the spte. Also introduce a helper function to do
> > range-based flushing when a spte is dropped, which helps
> > prevent future buggy uses of kvm_flush_remote_tlbs_with_address()
> > in such cases.
> > 
> > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > Suggested-by: David Matlack <dmatlack@google.com>
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c         | 20 +++++++++++++++-----
> >  arch/x86/kvm/mmu/paging_tmpl.h |  3 +--
> >  2 files changed, 16 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3bcff56df109..e0b9432b9491 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -260,6 +260,18 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> >  	kvm_flush_remote_tlbs_with_range(kvm, &range);
> >  }
> >  
> > +static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index);
> > +
> > +/* Flush the range of guest memory mapped by the given SPTE. */
> > +static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
> > +{
> > +	struct kvm_mmu_page *sp = sptep_to_sp(sptep);
> > +	gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));
> > +
> > +	kvm_flush_remote_tlbs_with_address(kvm, gfn,
> > +					   KVM_PAGES_PER_HPAGE(sp->role.level));
> 
> How is the range-based TLB flushing supposed to work with indirect MMUs?
> When KVM is using shadow paging, the gfn here is not part of the actual
> translation.
> 
> For example, when TDP is disabled, KVM's shadow page tables translate
> GVA to HPA. When Nested Virtualization is in use and running L2, KVM's
> shadow page tables translate nGPA to HPA.
> 
> Ah, I see x86_ops.tlb_remote_flush_with_range is only set when running
> on Hyper-V and TDP is enabled (VMX checks enable_ept and SVM checks
> npt_enabled). But it looks like the nested case might still be broken?
>
Yeah, range-based tlb flushing is only used on Hyper-V, as Paolo said.
Actually, I don't know how Hyper-V implements the range-based tlb
flushing hypercall. Since KVM on Hyper-V is already nested, the gfn here
is already an nGPA. It seems that the currently used TDP root is passed
in the hypercall, so maybe Hyper-V could do the nGPA-to-HPA translation
by looking up the TDP page table. Then the nested case could work well?

> > +}
> > +
> >  /* Flush all memory mapped by the given direct SP. */
> >  static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> >  {
> > @@ -1156,8 +1168,7 @@ static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush)
> >  	drop_spte(kvm, sptep);
> >  
> >  	if (flush)
> > -		kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > -			KVM_PAGES_PER_HPAGE(sp->role.level));
> > +		kvm_flush_remote_tlbs_sptep(kvm, sptep);
> >  }
> >  
> >  /*
> > @@ -1608,7 +1619,7 @@ static void __rmap_add(struct kvm *kvm,
> >  	if (rmap_count > RMAP_RECYCLE_THRESHOLD) {
> >  		kvm_zap_all_rmap_sptes(kvm, rmap_head);
> >  		kvm_flush_remote_tlbs_with_address(
> > -				kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
> > +				kvm, gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
> >  	}
> >  }
> >  
> > @@ -6402,8 +6413,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
> >  			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);
> >  
> >  			if (kvm_available_flush_tlb_with_range())
> > -				kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > -					KVM_PAGES_PER_HPAGE(sp->role.level));
> > +				kvm_flush_remote_tlbs_sptep(kvm, sptep);
> >  			else
> >  				need_tlb_flush = 1;
> >  
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 39e0205e7300..04149c704d5b 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -937,8 +937,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
> >  
> >  			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
> >  			if (is_shadow_present_pte(old_spte))
> > -				kvm_flush_remote_tlbs_with_address(vcpu->kvm,
> > -					sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level));
> > +				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
> >  
> >  			if (!rmap_can_add(vcpu))
> >  				break;
> > -- 
> > 2.31.1
> > 


* Re: [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)()
  2022-09-07 17:40   ` David Matlack
@ 2022-09-13 12:58     ` Hou Wenlong
  2022-09-13 13:57       ` David Matlack
  0 siblings, 1 reply; 24+ messages in thread
From: Hou Wenlong @ 2022-09-13 12:58 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel

On Thu, Sep 08, 2022 at 01:40:16AM +0800, David Matlack wrote:
> On Wed, Aug 24, 2022 at 05:29:23PM +0800, Hou Wenlong wrote:
> > Only an SP at the PG_LEVEL_4K level can be unsync, so the size of the
> > gfn range must be 1.
> > 
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 04149c704d5b..486a3163b1e4 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -937,7 +937,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
> >  
> >  			mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
> >  			if (is_shadow_present_pte(old_spte))
> > -				kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
> > +				kvm_flush_remote_tlbs_gfn(vcpu->kvm,
> > +					kvm_mmu_page_get_gfn(sp, sptep - sp->spt), 1);
> 
> The third argument to kvm_flush_remote_tlbs_gfn() is the level, not the
> number of pages. But that aside, I don't understand why this patch is
> necessary. kvm_flush_remote_tlbs_sptep() should already do the right
> thing.
>
Since only an SP at PG_LEVEL_4K can be unsync, the level must be
PG_LEVEL_4K, so the sp->role.level access could be dropped. However,
I'm not sure whether it is useful; I can drop it if it is useless.

> >  
> >  			if (!rmap_can_add(vcpu))
> >  				break;
> > -- 
> > 2.31.1
> > 


* Re: [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)()
  2022-09-13 12:58     ` Hou Wenlong
@ 2022-09-13 13:57       ` David Matlack
  2022-09-16 19:33         ` Sean Christopherson
  0 siblings, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-13 13:57 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: kvm list, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML,
	H. Peter Anvin, LKML

On Tue, Sep 13, 2022 at 5:58 AM Hou Wenlong <houwenlong.hwl@antgroup.com> wrote:
>
> On Thu, Sep 08, 2022 at 01:40:16AM +0800, David Matlack wrote:
> > On Wed, Aug 24, 2022 at 05:29:23PM +0800, Hou Wenlong wrote:
> > > Only an SP at the PG_LEVEL_4K level can be unsync, so the size of the
> > > gfn range must be 1.
> > >
> > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > ---
> > >  arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > > index 04149c704d5b..486a3163b1e4 100644
> > > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > > @@ -937,7 +937,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
> > >
> > >                     mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
> > >                     if (is_shadow_present_pte(old_spte))
> > > -                           kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
> > > +                           kvm_flush_remote_tlbs_gfn(vcpu->kvm,
> > > +                                   kvm_mmu_page_get_gfn(sp, sptep - sp->spt), 1);
> >
> > The third argument to kvm_flush_remote_tlbs_gfn() is the level, not the
> > number of pages. But that aside, I don't understand why this patch is
> > necessary. kvm_flush_remote_tlbs_sptep() should already do the right
> > thing.
> >
> Since only an SP at PG_LEVEL_4K can be unsync, the level must be
> PG_LEVEL_4K, so the sp->role.level access could be dropped. However,
> I'm not sure whether it is useful; I can drop it if it is useless.

Ah, I see. I would be surprised if avoiding the read of sp->role.level
has any noticeable impact on VM performance, so I vote to drop this patch.


>
> > >
> > >                     if (!rmap_can_add(vcpu))
> > >                             break;
> > > --
> > > 2.31.1
> > >


* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-13 12:07     ` Hou Wenlong
@ 2022-09-15 11:47       ` Liam Ni
  2022-09-16  2:49         ` Hou Wenlong
  2022-09-20 18:08         ` David Matlack
  0 siblings, 2 replies; 24+ messages in thread
From: Liam Ni @ 2022-09-15 11:47 UTC (permalink / raw)
  To: Hou Wenlong
  Cc: David Matlack, kvm, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

On Tue, 13 Sept 2022 at 20:13, Hou Wenlong <houwenlong.hwl@antgroup.com> wrote:
>
> On Thu, Sep 08, 2022 at 01:43:54AM +0800, David Matlack wrote:
> > On Wed, Aug 24, 2022 at 05:29:18PM +0800, Hou Wenlong wrote:
> > > The spte pointing to the child SP is dropped, so the
> > > whole gfn range covered by the child SP should be flushed.
> > > Although Hyper-V may treat a 1-page flush the same way
> > > when the address points to a huge page, it is still better
> > > to use the correct huge page size. Also introduce
> > > a helper function to do range-based flushing when a direct
> > > SP is dropped, which helps prevent future buggy uses
> > > of kvm_flush_remote_tlbs_with_address() in such cases.
> > >
> > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > Suggested-by: David Matlack <dmatlack@google.com>
> > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > >     kvm_flush_remote_tlbs_with_range(kvm, &range);
> > >  }
> > >
> > > +/* Flush all memory mapped by the given direct SP. */
> > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > +{
> > > +   WARN_ON_ONCE(!sp->role.direct);
> > > +   kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > > +                                      KVM_PAGES_PER_HPAGE(sp->role.level + 1));

Do we need the "+ 1" here? sp->role.level == 1 means a 4K page.
I think it should be "KVM_PAGES_PER_HPAGE(sp->role.level)".

> >
> > nit: I think it would make sense to introduce
> > kvm_flush_remote_tlbs_gfn() in this patch since you are going to
> > eventually use it here anyway.
> >
> OK, I'll do it in the next version. Thanks!
>
> > > +}
> > > +
> > >  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
> > >                        unsigned int access)
> > >  {
> > > @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> > >                     return;
> > >
> > >             drop_parent_pte(child, sptep);
> > > -           kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> > > +           kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
> > >     }
> > >  }
> > >
> > > --
> > > 2.31.1
> > >


* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-15 11:47       ` Liam Ni
@ 2022-09-16  2:49         ` Hou Wenlong
  2022-09-20 18:08         ` David Matlack
  1 sibling, 0 replies; 24+ messages in thread
From: Hou Wenlong @ 2022-09-16  2:49 UTC (permalink / raw)
  To: Liam Ni
  Cc: David Matlack, kvm, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

On Thu, Sep 15, 2022 at 07:47:35PM +0800, Liam Ni wrote:
> On Tue, 13 Sept 2022 at 20:13, Hou Wenlong <houwenlong.hwl@antgroup.com> wrote:
> >
> > On Thu, Sep 08, 2022 at 01:43:54AM +0800, David Matlack wrote:
> > > On Wed, Aug 24, 2022 at 05:29:18PM +0800, Hou Wenlong wrote:
> > > > The spte pointing to the child SP is dropped, so the
> > > > whole gfn range covered by the child SP should be flushed.
> > > > Although Hyper-V may treat a 1-page flush the same way
> > > > when the address points to a huge page, it is still better
> > > > to use the correct huge page size. Also introduce
> > > > a helper function to do range-based flushing when a direct
> > > > SP is dropped, which helps prevent future buggy uses
> > > > of kvm_flush_remote_tlbs_with_address() in such cases.
> > > >
> > > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > > Suggested-by: David Matlack <dmatlack@google.com>
> > > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > > ---
> > > >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > > >     kvm_flush_remote_tlbs_with_range(kvm, &range);
> > > >  }
> > > >
> > > > +/* Flush all memory mapped by the given direct SP. */
> > > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > > +{
> > > > +   WARN_ON_ONCE(!sp->role.direct);
> > > > +   kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > > > +                                      KVM_PAGES_PER_HPAGE(sp->role.level + 1));
> 
> Do we need the "+ 1" here? sp->role.level == 1 means a 4K page.
> I think it should be "KVM_PAGES_PER_HPAGE(sp->role.level)".
>
This helper function is used to flush all memory mapped by the given
SP, not one spte in the SP. If sp->role.level == 1, the SP maps 2M of
memory, but KVM_PAGES_PER_HPAGE(sp->role.level) is only 1 page.

> > >
> > > nit: I think it would make sense to introduce
> > > kvm_flush_remote_tlbs_gfn() in this patch since you are going to
> > > eventually use it here anyway.
> > >
> > OK, I'll do it in the next version. Thanks!
> >
> > > > +}
> > > > +
> > > >  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
> > > >                        unsigned int access)
> > > >  {
> > > > @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> > > >                     return;
> > > >
> > > >             drop_parent_pte(child, sptep);
> > > > -           kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> > > > +           kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
> > > >     }
> > > >  }
> > > >
> > > > --
> > > > 2.31.1
> > > >


* Re: [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)()
  2022-09-13 13:57       ` David Matlack
@ 2022-09-16 19:33         ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2022-09-16 19:33 UTC (permalink / raw)
  To: David Matlack
  Cc: Hou Wenlong, kvm list, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML,
	H. Peter Anvin, LKML

On Tue, Sep 13, 2022, David Matlack wrote:
> On Tue, Sep 13, 2022 at 5:58 AM Hou Wenlong <houwenlong.hwl@antgroup.com> wrote:
> >
> > On Thu, Sep 08, 2022 at 01:40:16AM +0800, David Matlack wrote:
> > > On Wed, Aug 24, 2022 at 05:29:23PM +0800, Hou Wenlong wrote:
> > > > Only an SP at the PG_LEVEL_4K level can be unsync, so the size of the
> > > > gfn range must be 1.
> > > >
> > > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > > ---
> > > >  arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > > > index 04149c704d5b..486a3163b1e4 100644
> > > > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > > > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > > > @@ -937,7 +937,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
> > > >
> > > >                     mmu_page_zap_pte(vcpu->kvm, sp, sptep, NULL);
> > > >                     if (is_shadow_present_pte(old_spte))
> > > > -                           kvm_flush_remote_tlbs_sptep(vcpu->kvm, sptep);
> > > > +                           kvm_flush_remote_tlbs_gfn(vcpu->kvm,
> > > > +                                   kvm_mmu_page_get_gfn(sp, sptep - sp->spt), 1);
> > >
> > > The third argument to kvm_flush_remote_tlbs_gfn() is the level, not the
> > > number of pages. But that aside, I don't understand why this patch is
> > > necessary. kvm_flush_remote_tlbs_sptep() should already do the right
> > > thing.
> > >
> > Since only an SP at PG_LEVEL_4K can be unsync, the level must be
> > PG_LEVEL_4K, so the sp->role.level access could be dropped. However,
> > I'm not sure whether it is useful; I can drop it if it is useless.
> 
> Ah, I see. I would be surprised if avoiding the read of sp->role.level
> has any noticeable impact on VM performance, so I vote to drop this patch.

Agreed, the cost of the sp->role.level lookup is negligible in this case, and IMO
using kvm_flush_remote_tlbs_sptep() is more intuitive.

If kvm_flush_remote_tlbs_sptep() didn't exist and this was open coding the use of
kvm_flush_remote_tlbs_with_address() + KVM_PAGES_PER_HPAGE(), then I would be in
favor of hardcoding '1', because at that point the use of KVM_PAGES_PER_HPAGE() is
misleading in its own way.


* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-08-24  9:29 ` [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte() Hou Wenlong
  2022-09-07 17:43   ` David Matlack
@ 2022-09-18 13:11   ` Robert Hoo
  2022-09-20 18:32     ` David Matlack
  1 sibling, 1 reply; 24+ messages in thread
From: Robert Hoo @ 2022-09-18 13:11 UTC (permalink / raw)
  To: Hou Wenlong, kvm
  Cc: David Matlack, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

On Wed, 2022-08-24 at 17:29 +0800, Hou Wenlong wrote:
> The spte pointing to the child SP is dropped, so the
> whole gfn range covered by the child SP should be flushed.
> Although Hyper-V may treat a 1-page flush the same way
> when the address points to a huge page, it is still better
> to use the correct huge page size. Also introduce
> a helper function to do range-based flushing when a direct
> SP is dropped, which helps prevent future buggy uses
> of kvm_flush_remote_tlbs_with_address() in such cases.
> 
> Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> Suggested-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e418ef3ecfcb..a3578abd8bbc 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
>  	kvm_flush_remote_tlbs_with_range(kvm, &range);
>  }
>  
> +/* Flush all memory mapped by the given direct SP. */
> +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> +{
> +	WARN_ON_ONCE(!sp->role.direct);

What if !sp->role.direct? Then the flush below, based on sp->gfn, isn't
what is expected, but it is still done. Is this operation harmless?

> +	kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> +					   KVM_PAGES_PER_HPAGE(sp->role.level + 1));
> +}
> +
>  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
>  			   unsigned int access)
>  {
> @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  			return;
>  
>  		drop_parent_pte(child, sptep);
> -		kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> +		kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
>  	}
>  }
>  



* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-15 11:47       ` Liam Ni
  2022-09-16  2:49         ` Hou Wenlong
@ 2022-09-20 18:08         ` David Matlack
  1 sibling, 0 replies; 24+ messages in thread
From: David Matlack @ 2022-09-20 18:08 UTC (permalink / raw)
  To: Liam Ni
  Cc: Hou Wenlong, kvm list, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Lan Tianyu, LKML

On Thu, Sep 15, 2022 at 4:47 AM Liam Ni <zhiguangni01@gmail.com> wrote:
>
> On Tue, 13 Sept 2022 at 20:13, Hou Wenlong <houwenlong.hwl@antgroup.com> wrote:
> >
> > On Thu, Sep 08, 2022 at 01:43:54AM +0800, David Matlack wrote:
> > > On Wed, Aug 24, 2022 at 05:29:18PM +0800, Hou Wenlong wrote:
> > > > The spte pointing to the child SP is dropped, so the
> > > > whole gfn range covered by the child SP should be flushed.
> > > > Although Hyper-V may treat a 1-page flush the same way
> > > > when the address points to a huge page, it is still better
> > > > to use the correct huge page size. Also introduce
> > > > a helper function to do range-based flushing when a direct
> > > > SP is dropped, which helps prevent future buggy uses
> > > > of kvm_flush_remote_tlbs_with_address() in such cases.
> > > >
> > > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > > Suggested-by: David Matlack <dmatlack@google.com>
> > > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > > ---
> > > >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > > >     kvm_flush_remote_tlbs_with_range(kvm, &range);
> > > >  }
> > > >
> > > > +/* Flush all memory mapped by the given direct SP. */
> > > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > > +{
> > > > +   WARN_ON_ONCE(!sp->role.direct);
> > > > +   kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
> > > > +                                      KVM_PAGES_PER_HPAGE(sp->role.level + 1));
>
> Do we need the "+1" here? sp->role.level == 1 means a 4K page.
> I think it should be KVM_PAGES_PER_HPAGE(sp->role.level).

Yes, we need the "+ 1" here. kvm_flush_remote_tlbs_direct_sp() must
flush all memory mapped by the shadow page, which is equivalent to the
amount of memory mapped by a huge page at the next higher level. For
example, a shadow page with role.level == PG_LEVEL_4K maps 2 MiB of
the guest physical address space since 512 PTEs x 4KiB per PTE = 2MiB.
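
To make the arithmetic concrete, here is a sketch of the macro involved
(simplified from the in-tree x86 definitions, which encode 9 bits of gfn
per paging level; illustrative only, not part of the patch):

	#define KVM_HPAGE_GFN_SHIFT(x)	(((x) - 1) * 9)
	#define KVM_PAGES_PER_HPAGE(x)	(1UL << KVM_HPAGE_GFN_SHIFT(x))

	/*
	 * A direct SP with role.level == PG_LEVEL_4K (1) holds 512 SPTEs,
	 * each mapping one 4KiB page, so the SP maps 512 pages in total:
	 *
	 *   KVM_PAGES_PER_HPAGE(1) ==   1 page   (just sp->gfn itself)
	 *   KVM_PAGES_PER_HPAGE(2) == 512 pages  (the full 2MiB region)
	 *
	 * hence the flush must use sp->role.level + 1.
	 */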

>
> > >
> > > nit: I think it would make sense to introduce
> > > kvm_flush_remote_tlbs_gfn() in this patch since you are going to
> > > eventually use it here anyway.
> > >
> > OK, I'll do it in the next version. Thanks!
> >
> > > > +}
> > > > +
> > > >  static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
> > > >                        unsigned int access)
> > > >  {
> > > > @@ -2341,7 +2349,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> > > >                     return;
> > > >
> > > >             drop_parent_pte(child, sptep);
> > > > -           kvm_flush_remote_tlbs_with_address(vcpu->kvm, child->gfn, 1);
> > > > +           kvm_flush_remote_tlbs_direct_sp(vcpu->kvm, child);
> > > >     }
> > > >  }
> > > >
> > > > --
> > > > 2.31.1
> > > >

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-18 13:11   ` Robert Hoo
@ 2022-09-20 18:32     ` David Matlack
  2022-09-20 18:44       ` David Matlack
  0 siblings, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-20 18:32 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Hou Wenlong, kvm, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Lan Tianyu, linux-kernel

On Sun, Sep 18, 2022 at 09:11:00PM +0800, Robert Hoo wrote:
> On Wed, 2022-08-24 at 17:29 +0800, Hou Wenlong wrote:
> > The spte pointing to the children SP is dropped, so the
> > whole gfn range covered by the children SP should be flushed.
> > Although, Hyper-V may treat a 1-page flush the same if the
> > address points to a huge page, it still would be better
> > to use the correct size of huge page. Also introduce
> > a helper function to do range-based flushing when a direct
> > SP is dropped, which would help prevent future buggy use
> > of kvm_flush_remote_tlbs_with_address() in such case.
> > 
> > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > Suggested-by: David Matlack <dmatlack@google.com>
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e418ef3ecfcb..a3578abd8bbc 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> >  	kvm_flush_remote_tlbs_with_range(kvm, &range);
> >  }
> >  
> > +/* Flush all memory mapped by the given direct SP. */
> > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > +{
> > +	WARN_ON_ONCE(!sp->role.direct);
> 
> What if !sp->role.direct? Then the flush of sp->gfn below isn't the
> expected range, but it is still performed. Is this operation harmless?

Flushing TLBs is always harmless because KVM cannot ever assume an entry is
in the TLB. However, *not* (properly) flushing TLBs can be harmful. If KVM ever
calls kvm_flush_remote_tlbs_direct_sp() with an indirect SP, that is a bug in
KVM. The TLB flush here won't be harmful, as I explained, but KVM will miss
the flush that is actually needed.

That being said, I don't think any changes here are necessary.
kvm_flush_remote_tlbs_direct_sp() only has one caller, validate_direct_spte(),
which only operates on direct SPs. The name of the function also makes it
obvious this should only be called with a direct SP. And if we ever mess this
up in the future, we'll see the WARN_ON().

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-20 18:32     ` David Matlack
@ 2022-09-20 18:44       ` David Matlack
  2022-09-27  2:54         ` Robert Hoo
  0 siblings, 1 reply; 24+ messages in thread
From: David Matlack @ 2022-09-20 18:44 UTC (permalink / raw)
  To: Robert Hoo
  Cc: Hou Wenlong, kvm list, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Lan Tianyu, LKML

On Tue, Sep 20, 2022 at 11:32 AM David Matlack <dmatlack@google.com> wrote:
>
> On Sun, Sep 18, 2022 at 09:11:00PM +0800, Robert Hoo wrote:
> > On Wed, 2022-08-24 at 17:29 +0800, Hou Wenlong wrote:
> > > The spte pointing to the children SP is dropped, so the
> > > whole gfn range covered by the children SP should be flushed.
> > > Although, Hyper-V may treat a 1-page flush the same if the
> > > address points to a huge page, it still would be better
> > > to use the correct size of huge page. Also introduce
> > > a helper function to do range-based flushing when a direct
> > > SP is dropped, which would help prevent future buggy use
> > > of kvm_flush_remote_tlbs_with_address() in such case.
> > >
> > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > Suggested-by: David Matlack <dmatlack@google.com>
> > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > >     kvm_flush_remote_tlbs_with_range(kvm, &range);
> > >  }
> > >
> > > +/* Flush all memory mapped by the given direct SP. */
> > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > +{
> > > +   WARN_ON_ONCE(!sp->role.direct);
> >
> > What if !sp->role.direct? Then the flush of sp->gfn below isn't the
> > expected range, but it is still performed. Is this operation harmless?
>
> Flushing TLBs is always harmless because KVM cannot ever assume an entry is
> in the TLB. However, *not* (properly) flushing TLBs can be harmful. If KVM ever
> calls kvm_flush_remote_tlbs_direct_sp() with an indirect SP, that is a bug in
> KVM. The TLB flush here won't be harmful, as I explained, but KVM will
> miss the flush that is actually needed.
>
> That being said, I don't think any changes here are necessary.
> kvm_flush_remote_tlbs_direct_sp() only has one caller, validate_direct_spte(),
> which only operates on direct SPs. The name of the function also makes it
> obvious this should only be called with a direct SP. And if we ever mess this
> up in the future, we'll see the WARN_ON().

That being said, we might as well replace the WARN_ON_ONCE() with
KVM_BUG_ON(). That will still do a WARN_ON_ONCE() but has the added
benefit of terminating the VM.
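
A minimal sketch of that variant (assuming KVM_BUG_ON()'s existing
semantics: it evaluates to the condition after a one-time WARN and marks
the VM bugged; untested):

	static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm,
						    struct kvm_mmu_page *sp)
	{
		/* WARN once and terminate the VM if the SP is not direct. */
		if (KVM_BUG_ON(!sp->role.direct, kvm))
			return;

		kvm_flush_remote_tlbs_with_address(kvm, sp->gfn,
						   KVM_PAGES_PER_HPAGE(sp->role.level + 1));
	}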

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-20 18:44       ` David Matlack
@ 2022-09-27  2:54         ` Robert Hoo
  2022-09-27 16:44           ` Sean Christopherson
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Hoo @ 2022-09-27  2:54 UTC (permalink / raw)
  To: David Matlack
  Cc: Hou Wenlong, kvm list, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Lan Tianyu, LKML

On Tue, 2022-09-20 at 11:44 -0700, David Matlack wrote:
> On Tue, Sep 20, 2022 at 11:32 AM David Matlack <dmatlack@google.com> wrote:
> > 
> > On Sun, Sep 18, 2022 at 09:11:00PM +0800, Robert Hoo wrote:
> > > On Wed, 2022-08-24 at 17:29 +0800, Hou Wenlong wrote:
> > > > The spte pointing to the children SP is dropped, so the
> > > > whole gfn range covered by the children SP should be flushed.
> > > > Although, Hyper-V may treat a 1-page flush the same if the
> > > > address points to a huge page, it still would be better
> > > > to use the correct size of huge page. Also introduce
> > > > a helper function to do range-based flushing when a direct
> > > > SP is dropped, which would help prevent future buggy use
> > > > of kvm_flush_remote_tlbs_with_address() in such case.
> > > > 
> > > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > > Suggested-by: David Matlack <dmatlack@google.com>
> > > > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > > ---
> > > >  arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > > >     kvm_flush_remote_tlbs_with_range(kvm, &range);
> > > >  }
> > > > 
> > > > +/* Flush all memory mapped by the given direct SP. */
> > > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > > +{
> > > > +   WARN_ON_ONCE(!sp->role.direct);
> > > 
> > > What if !sp->role.direct? Then the flush of sp->gfn below isn't the
> > > expected range, but it is still performed. Is this operation harmless?
> > 
> > Flushing TLBs is always harmless because KVM cannot ever assume an
> > entry is in the TLB. However, *not* (properly) flushing TLBs can be
> > harmful. If KVM ever calls kvm_flush_remote_tlbs_direct_sp() with an
> > indirect SP, that is a bug in KVM. The TLB flush here won't be harmful,
> > as I explained, but KVM will miss the flush that is actually needed.
> > 
Yes, agreed: not harmful, just the cost of a TLB miss. Thanks.

> > That being said, I don't think any changes here are necessary.
> > kvm_flush_remote_tlbs_direct_sp() only has one caller,
> > validate_direct_spte(), which only operates on direct SPs. The name of
> > the function also makes it obvious this should only be called with a
> > direct SP. And if we ever mess this up in the future, we'll see the
> > WARN_ON().
> 
> That being said, we might as well replace the WARN_ON_ONCE() with
> KVM_BUG_ON(). That will still do a WARN_ON_ONCE() but has the added
> benefit of terminating the VM.

Yeah, that was my point: WARN_ON_ONCE() might not warn loudly enough, as it
is usually used for recoverable cases. But terminating the VM also seems
like an overreaction to me. Just my 2 cents; the whole patch is good.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()
  2022-09-27  2:54         ` Robert Hoo
@ 2022-09-27 16:44           ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2022-09-27 16:44 UTC (permalink / raw)
  To: Robert Hoo
  Cc: David Matlack, Hou Wenlong, kvm list, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Lan Tianyu, LKML

On Tue, Sep 27, 2022, Robert Hoo wrote:
> On Tue, 2022-09-20 at 11:44 -0700, David Matlack wrote:
> > That being said, we might as well replace the WARN_ON_ONCE() with
> > KVM_BUG_ON(). That will still do a WARN_ON_ONCE() but has the added
> > benefit of terminating the VM.
> 
> Yeah, that was my point: WARN_ON_ONCE() might not warn loudly enough, as
> it is usually used for recoverable cases. But terminating the VM also
> seems like an overreaction to me.

Agreed, from the helper's perspective, killing the VM is unnecessary, e.g. it can
simply flush the entire TLB as a fallback.

	if (WARN_ON_ONCE(!sp->role.direct)) {
		kvm_flush_remote_tlbs(kvm);
		return;
	}

But looking at the series as a whole, I think the better option is to just not
introduce this helper.  There's exactly one user, and if that path encounters an
indirect shadow page then KVM is deep in the weeds.  But that's a moot point,
because unless I'm misreading the flow, there's no need for the unique helper.
kvm_flush_remote_tlbs_sptep() will do the right thing even if the target SP happens
to be indirect.

If we really want to assert that the child is a direct shadow page, then
validate_direct_spte() would be the right location for such an assert.  IMO that's
unnecessary paranoia.  The odds of KVM reaching this point with a completely messed
up shadow paging tree, and without already having hosed the guest and/or yelled
loudly are very, very low.

In other words, IMO we're getting too fancy for this one and we should instead
simply do:

		child = to_shadow_page(*sptep & SPTE_BASE_ADDR_MASK);
		if (child->role.access == direct_access)
			return;

		drop_parent_pte(child, sptep);
		kvm_flush_remote_tlbs_sptep(kvm, sptep);

That has the added benefit of flushing the same "thing" that is zapped, i.e. both
operate on the parent SPTE, not the child.

Note, kvm_flush_remote_tlbs_sptep() operates on the _parent_, where the above
snippets retrieves the child.  This becomes much more obvious once spte_to_child_sp()
comes along: https://lore.kernel.org/all/20220805230513.148869-8-seanjc@google.com.
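
For reference, a rough sketch of such a helper (assuming sptep_to_sp(),
spte_index() and kvm_mmu_page_get_gfn() as they exist in the tree, and
glossing over gfn alignment; the series' final version may differ):

	static void kvm_flush_remote_tlbs_sptep(struct kvm *kvm, u64 *sptep)
	{
		/* The SP that contains sptep, i.e. the parent of its target. */
		struct kvm_mmu_page *sp = sptep_to_sp(sptep);
		gfn_t gfn = kvm_mmu_page_get_gfn(sp, spte_index(sptep));

		/* Flush everything the SPTE maps at the parent's level. */
		kvm_flush_remote_tlbs_with_address(kvm, gfn,
						   KVM_PAGES_PER_HPAGE(sp->role.level));
	}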

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-09-27 16:44 UTC | newest]

Thread overview: 24+ messages
     [not found] <cover.1661331396.git.houwenlong.hwl@antgroup.com>
2022-08-24  9:29 ` [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte() Hou Wenlong
2022-09-07 17:43   ` David Matlack
2022-09-13 12:07     ` Hou Wenlong
2022-09-15 11:47       ` Liam Ni
2022-09-16  2:49         ` Hou Wenlong
2022-09-20 18:08         ` David Matlack
2022-09-18 13:11   ` Robert Hoo
2022-09-20 18:32     ` David Matlack
2022-09-20 18:44       ` David Matlack
2022-09-27  2:54         ` Robert Hoo
2022-09-27 16:44           ` Sean Christopherson
2022-08-24  9:29 ` [PATCH v2 2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmapp() Hou Wenlong
2022-09-07 17:50   ` David Matlack
2022-08-24  9:29 ` [PATCH v2 3/6] KVM: x86/mmu: Reduce gfn range of tlb flushing in tdp_mmu_map_handle_target_level() Hou Wenlong
2022-09-07 17:58   ` David Matlack
2022-08-24  9:29 ` [PATCH v2 4/6] KVM: x86/mmu: Fix wrong start gfn of tlb flushing with range Hou Wenlong
2022-09-07 18:25   ` David Matlack
2022-09-13 12:50     ` Hou Wenlong
2022-08-24  9:29 ` [PATCH v2 5/6] KVM: x86/mmu: Introduce helper function to do range-based flushing for given page Hou Wenlong
2022-08-24  9:29 ` [PATCH v2 6/6] KVM: x86/mmu: Use 1 as the size of gfn range for tlb flushing in FNAME(invlpg)() Hou Wenlong
2022-09-07 17:40   ` David Matlack
2022-09-13 12:58     ` Hou Wenlong
2022-09-13 13:57       ` David Matlack
2022-09-16 19:33         ` Sean Christopherson
