stable.vger.kernel.org archive mirror
* [PATCH 1/2] KVM: x86: allow kvm_tdp_mmu_zap_invalidated_roots with write-locked mmu_lock
       [not found] <20211213112514.78552-1-pbonzini@redhat.com>
@ 2021-12-13 11:25 ` Paolo Bonzini
  2021-12-13 11:25 ` [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all Paolo Bonzini
  1 sibling, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2021-12-13 11:25 UTC
  To: linux-kernel, kvm
  Cc: seanjc, ignat, bgardon, dmatlack, stevensd, kernel-team, stable

Zapping within a write-side critical section is more efficient, so it is
desirable if we know that no vCPU is running (such as within the .release
MMU notifier callback).  Prepare for reusing kvm_tdp_mmu_zap_invalidated_roots
in such scenarios.

Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c     |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c | 17 +++++++++--------
 arch/x86/kvm/mmu/tdp_mmu.h |  2 +-
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4a3bcdd3cfe7..6fe4ab8fc0ca 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5695,7 +5695,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 
 	if (is_tdp_mmu_enabled(kvm)) {
 		read_lock(&kvm->mmu_lock);
-		kvm_tdp_mmu_zap_invalidated_roots(kvm);
+		kvm_tdp_mmu_zap_invalidated_roots(kvm, true);
 		read_unlock(&kvm->mmu_lock);
 	}
 }
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1db8496259ad..f2dd5c97bbc2 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -821,14 +821,18 @@ static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm,
  * only has to do a trivial amount of work. Since the roots are invalid,
  * no new SPTEs should be created under them.
  */
-void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
+void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm, bool shared)
 {
 	struct kvm_mmu_page *next_root;
 	struct kvm_mmu_page *root;
 	bool flush = false;
 
-	lockdep_assert_held_read(&kvm->mmu_lock);
+	kvm_lockdep_assert_mmu_lock_held(kvm, shared);
 
+	/*
+	 * rcu_read_lock is only needed for shared == true, but we
+	 * always take it for simplicity.
+	 */
 	rcu_read_lock();
 
 	root = next_invalidated_root(kvm, NULL);
@@ -838,13 +842,10 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
 
 		rcu_read_unlock();
 
-		flush = zap_gfn_range(kvm, root, 0, -1ull, true, flush, true);
+		flush = zap_gfn_range(kvm, root, 0, -1ull, true, flush, shared);
 
-		/*
-		 * Put the reference acquired in
-		 * kvm_tdp_mmu_invalidate_roots
-		 */
-		kvm_tdp_mmu_put_root(kvm, root, true);
+		/* Put the reference acquired in kvm_tdp_mmu_invalidate_roots.  */
+		kvm_tdp_mmu_put_root(kvm, root, shared);
 
 		root = next_root;
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 3899004a5d91..24809f4ed090 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -46,7 +46,7 @@ static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm);
-void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm);
+void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm, bool shared);
 
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 
-- 
2.31.1
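
A note for readers of this patch: the new "shared" argument tells the helper
whether the caller holds mmu_lock for read (shared == true) or for write
(shared == false), and the same function body then serves both callers.  The
userspace model below is only a sketch of that calling convention.  It is not
kernel code: toy_lock, toy_lock_held_as and toy_zap are made-up names, and the
"lock" is a plain enum that records the mode instead of a real rwlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for mmu_lock: records how the lock is held. */
enum toy_lock_mode { TOY_UNLOCKED, TOY_HELD_READ, TOY_HELD_WRITE };

struct toy_lock {
	enum toy_lock_mode mode;
};

/* True when the lock is held in the mode the caller claims to hold it in. */
bool toy_lock_held_as(const struct toy_lock *lock, bool shared)
{
	return lock->mode == (shared ? TOY_HELD_READ : TOY_HELD_WRITE);
}

/*
 * One zap routine for both read-side and write-side callers.  Returns the
 * number of entries cleared, or -1 if the claimed lock mode does not match.
 */
int toy_zap(const struct toy_lock *lock, bool shared, int *entries, int n)
{
	int zapped = 0;

	if (!toy_lock_held_as(lock, shared))
		return -1;

	for (int i = 0; i < n; i++) {
		if (entries[i]) {
			entries[i] = 0;
			zapped++;
		}
	}
	return zapped;
}
```

In this model a caller that holds the lock for write and passes shared == true
is rejected, which is roughly the mistake the lockdep assertion in the patch is
there to catch.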




* [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all
       [not found] <20211213112514.78552-1-pbonzini@redhat.com>
  2021-12-13 11:25 ` [PATCH 1/2] KVM: x86: allow kvm_tdp_mmu_zap_invalidated_roots with write-locked mmu_lock Paolo Bonzini
@ 2021-12-13 11:25 ` Paolo Bonzini
  2021-12-13 16:36   ` Sean Christopherson
  1 sibling, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2021-12-13 11:25 UTC
  To: linux-kernel, kvm
  Cc: seanjc, ignat, bgardon, dmatlack, stevensd, kernel-team, stable

kvm_tdp_mmu_zap_all is intended to visit all roots and zap their page
tables, which flushes the accessed and dirty bits out to the Linux
"struct page"s.  Missing some of the roots has catastrophic effects,
because kvm_tdp_mmu_zap_all is called when the MMU notifier is being
removed and any PTEs left behind might become dangling by the time
kvm_arch_destroy_vm tears down the roots for good.

Unfortunately that is exactly what kvm_tdp_mmu_zap_all is doing: it
visits all roots via for_each_tdp_mmu_root_yield_safe, which in turn
uses kvm_tdp_mmu_get_root to skip invalid roots.  If the current root is
invalid at the time of kvm_tdp_mmu_zap_all, its page tables will remain
in place but will later be zapped during kvm_arch_destroy_vm.

To fix this, ensure that kvm_tdp_mmu_zap_all goes over all roots,
including the invalid ones.  The easiest way to do so is for
kvm_tdp_mmu_zap_all to do the same as kvm_mmu_zap_all_fast: invalidate
all roots, and then zap the invalid roots.  However, there is no need
to go through tdp_mmu_zap_spte_atomic because there are no running vCPUs.

Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
Cc: stable@vger.kernel.org
Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f2dd5c97bbc2..ce3fafb6c9a7 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -779,18 +779,6 @@ bool __kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, int as_id, gfn_t start,
 	return flush;
 }
 
-void kvm_tdp_mmu_zap_all(struct kvm *kvm)
-{
-	bool flush = false;
-	int i;
-
-	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
-		flush = kvm_tdp_mmu_zap_gfn_range(kvm, i, 0, -1ull, flush);
-
-	if (flush)
-		kvm_flush_remote_tlbs(kvm);
-}
-
 static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm,
 						  struct kvm_mmu_page *prev_root)
 {
@@ -888,6 +876,19 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
 			root->role.invalid = true;
 }
 
+void kvm_tdp_mmu_zap_all(struct kvm *kvm)
+{
+	/*
+	 * We need to zap all roots, including already-invalid ones.  The
+	 * easiest way is to ensure there's only invalid roots which then,
+	 * for efficiency, we zap while mmu_lock is taken exclusively.
+	 * Since the MMU notifier is being torn down, contention on the
+	 * mmu_lock is not an issue.
+	 */
+	kvm_tdp_mmu_invalidate_all_roots(kvm);
+	kvm_tdp_mmu_zap_invalidated_roots(kvm, false);
+}
+
 /*
  * Installs a last-level SPTE to handle a TDP page fault.
  * (NPT/EPT violation/misconfiguration)
-- 
2.31.1
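
A note for readers of this patch: the commit message describes a walker that
skips invalid roots and therefore leaves their page tables behind, and a fix
that first marks every root invalid and then zaps all invalid roots.  The
userspace model below is a sketch of that difference under simplifying
assumptions.  It is not kernel code: toy_root and both functions are made-up
names, and a root's page tables are reduced to a single counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a TDP MMU root. */
struct toy_root {
	bool invalid;
	int sptes;	/* live entries still mapped under this root */
};

/*
 * Models the old behaviour: invalid roots are skipped, so whatever they
 * still map survives.  Returns the number of entries left behind.
 */
int toy_zap_all_skipping_invalid(struct toy_root *roots, size_t n)
{
	int left = 0;

	for (size_t i = 0; i < n; i++) {
		if (!roots[i].invalid)
			roots[i].sptes = 0;
		left += roots[i].sptes;
	}
	return left;
}

/*
 * Models the new behaviour: invalidate every root, then zap every invalid
 * root.  Returns the number of entries left behind.
 */
int toy_zap_all_invalidate_first(struct toy_root *roots, size_t n)
{
	int left = 0;

	for (size_t i = 0; i < n; i++)
		roots[i].invalid = true;

	for (size_t i = 0; i < n; i++) {
		if (roots[i].invalid)
			roots[i].sptes = 0;
		left += roots[i].sptes;
	}
	return left;
}
```

With one valid root and one already-invalid root, the first function leaves
the invalid root's entries in place while the second leaves nothing.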



* Re: [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all
  2021-12-13 11:25 ` [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all Paolo Bonzini
@ 2021-12-13 16:36   ` Sean Christopherson
  2021-12-14 19:45     ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: Sean Christopherson @ 2021-12-13 16:36 UTC
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, ignat, bgardon, dmatlack, stevensd,
	kernel-team, stable

On Mon, Dec 13, 2021, Paolo Bonzini wrote:
> kvm_tdp_mmu_zap_all is intended to visit all roots and zap their page
> tables, which flushes the accessed and dirty bits out to the Linux
> "struct page"s.  Missing some of the roots has catastrophic effects,
> because kvm_tdp_mmu_zap_all is called when the MMU notifier is being
> removed and any PTEs left behind might become dangling by the time
> kvm_arch_destroy_vm tears down the roots for good.
> 
> Unfortunately that is exactly what kvm_tdp_mmu_zap_all is doing: it
> visits all roots via for_each_tdp_mmu_root_yield_safe, which in turn
> uses kvm_tdp_mmu_get_root to skip invalid roots.  If the current root is
> invalid at the time of kvm_tdp_mmu_zap_all, its page tables will remain
> in place but will later be zapped during kvm_arch_destroy_vm.

As stated in the bug report thread[*], it should be impossible for the MMU
notifier to be unregistered while kvm_mmu_zap_all_fast() is running.

I do believe there's a race between set_nx_huge_pages() and kvm_mmu_notifier_release(),
but that would result in the use-after-free kvm_set_pfn_dirty() tracing back to
set_nx_huge_pages(), not kvm_destroy_vm().  And for that, I would much prefer we
elevate mm->users while changing the NX hugepage setting.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8f0035517450..985df4db8192 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6092,10 +6092,15 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
                mutex_lock(&kvm_lock);

                list_for_each_entry(kvm, &vm_list, vm_list) {
+                       if (!mmget_not_zero(kvm->mm))
+                               continue;
+
                        mutex_lock(&kvm->slots_lock);
                        kvm_mmu_zap_all_fast(kvm);
                        mutex_unlock(&kvm->slots_lock);

+                       mmput_async(kvm->mm);
+
                        wake_up_process(kvm->arch.nx_lpage_recovery_thread);
                }
                mutex_unlock(&kvm_lock);

[*] https://lore.kernel.org/all/Ybdxd7QcJI71UpHm@google.com/
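
A note for readers of this reply: the diff above takes a reference on the mm
only if its user count has not already dropped to zero, skips the VM
otherwise, and drops the reference once the zap is done.  The userspace model
below is a sketch of that "get unless zero" pattern using C11 atomics.  It is
not the kernel's implementation: toy_mm, toy_get_not_zero, toy_put and
toy_visit_live are made-up names, and toy_put only decrements the counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of an mm with a user count. */
struct toy_mm {
	atomic_int users;
};

/* Take a reference only if the count is still non-zero. */
bool toy_get_not_zero(struct toy_mm *mm)
{
	int old = atomic_load(&mm->users);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken with toy_get_not_zero(). */
void toy_put(struct toy_mm *mm)
{
	atomic_fetch_sub(&mm->users, 1);
}

/*
 * Visit every mm that is still alive, skipping those already at zero.
 * Returns how many were visited.
 */
int toy_visit_live(struct toy_mm **mms, int n)
{
	int visited = 0;

	for (int i = 0; i < n; i++) {
		if (!toy_get_not_zero(mms[i]))
			continue;
		visited++;	/* the real work would happen here */
		toy_put(mms[i]);
	}
	return visited;
}
```

The point of the pattern in this model is that an mm whose count has reached
zero is never resurrected; the loop simply moves on to the next one.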


* Re: [PATCH 2/2] KVM: x86: zap invalid roots in kvm_tdp_mmu_zap_all
  2021-12-13 16:36   ` Sean Christopherson
@ 2021-12-14 19:45     ` Sean Christopherson
  0 siblings, 0 replies; 4+ messages in thread
From: Sean Christopherson @ 2021-12-14 19:45 UTC
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, ignat, bgardon, dmatlack, stevensd,
	kernel-team, stable

On Mon, Dec 13, 2021, Sean Christopherson wrote:
> On Mon, Dec 13, 2021, Paolo Bonzini wrote:
> > kvm_tdp_mmu_zap_all is intended to visit all roots and zap their page
> > tables, which flushes the accessed and dirty bits out to the Linux
> > "struct page"s.  Missing some of the roots has catastrophic effects,
> > because kvm_tdp_mmu_zap_all is called when the MMU notifier is being
> > removed and any PTEs left behind might become dangling by the time
> > kvm_arch_destroy_vm tears down the roots for good.
> > 
> > Unfortunately that is exactly what kvm_tdp_mmu_zap_all is doing: it
> > visits all roots via for_each_tdp_mmu_root_yield_safe, which in turn
> > uses kvm_tdp_mmu_get_root to skip invalid roots.  If the current root is
> > invalid at the time of kvm_tdp_mmu_zap_all, its page tables will remain
> > in place but will later be zapped during kvm_arch_destroy_vm.
> 
> As stated in the bug report thread[*], it should be impossible for the MMU
> notifier to be unregistered while kvm_mmu_zap_all_fast() is running.
> 
> I do believe there's a race between set_nx_huge_pages() and kvm_mmu_notifier_release(),
> but that would result in the use-after-free kvm_set_pfn_dirty() tracing back to
> set_nx_huge_pages(), not kvm_destroy_vm().  And for that, I would much prefer we
> elevate mm->users while changing the NX hugepage setting.

Mwhahaha, race confirmed with a bit of hacking to force the issue.  I'll get a
patch out.

