[12/13] KVM: x86/mmu: Fast invalidation for TDP MMU
diff mbox series

Message ID 20210331210841.3996155-13-bgardon@google.com
State New, archived
Headers show
Series
  • More parallel operations for the TDP MMU
Related show

Commit Message

Ben Gardon March 31, 2021, 9:08 p.m. UTC
Provide a real mechanism for fast invalidation by marking roots as
invalid so that their reference count will quickly fall to zero
and they will be torn down.

One negative side affect of this approach is that a vCPU thread will
likely drop the last reference to a root and be saddled with the work of
tearing down an entire paging structure. This issue will be resolved in
a later commit.

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c     |  6 +++---
 arch/x86/kvm/mmu/tdp_mmu.c | 14 ++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.h |  5 +++++
 3 files changed, 22 insertions(+), 3 deletions(-)

Comments

Sean Christopherson March 31, 2021, 10:27 p.m. UTC | #1
On Wed, Mar 31, 2021, Ben Gardon wrote:
> Provide a real mechanism for fast invalidation by marking roots as
> invalid so that their reference count will quickly fall to zero
> and they will be torn down.
> 
> One negative side affect of this approach is that a vCPU thread will
> likely drop the last reference to a root and be saddled with the work of
> tearing down an entire paging structure. This issue will be resolved in
> a later commit.
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---

...

> +/*
> + * This function depends on running in the same MMU lock cirical section as
> + * kvm_reload_remote_mmus. Since this is in the same critical section, no new
> + * roots will be created between this function and the MMU reload signals
> + * being sent.

Eww.  That goes beyond just adding a lockdep assertion here.  I know you want to
isolate the TDP MMU as much as possible, but this really feels like it should be
open coded in kvm_mmu_zap_all_fast().  And assuming this lands after as_id is
added to for_each_tdp_mmu_root(), it's probably easier to open code anyways, e.g.
use list_for_each_entry() directly instead of bouncing through an iterator.

> + */
> +void kvm_tdp_mmu_invalidate_roots(struct kvm *kvm)
> +{
> +	struct kvm_mmu_page *root;
> +
> +	for_each_tdp_mmu_root(kvm, root)
> +		root->role.invalid = true;
> +}
Paolo Bonzini April 1, 2021, 10:36 a.m. UTC | #2
On 31/03/21 23:08, Ben Gardon wrote:
>   
> +	if (is_tdp_mmu_enabled(kvm))
> +		kvm_tdp_mmu_invalidate_roots(kvm);
> +
>   	/*
>   	 * Toggle mmu_valid_gen between '0' and '1'.  Because slots_lock is
>   	 * held for the entire duration of zapping obsolete pages, it's
> @@ -5451,9 +5454,6 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
>   
>   	kvm_zap_obsolete_pages(kvm);
>   
> -	if (is_tdp_mmu_enabled(kvm))
> -		kvm_tdp_mmu_zap_all(kvm);
> -

This is just cosmetic, but I'd prefer to keep the call to 
kvm_tdp_mmu_invalidate_roots at the original place, so that it's clear 
in the next patch that it's two separate parts because of the different 
locking requirements.

Paolo
Ben Gardon April 1, 2021, 4:50 p.m. UTC | #3
On Thu, Apr 1, 2021 at 3:36 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 31/03/21 23:08, Ben Gardon wrote:
> >
> > +     if (is_tdp_mmu_enabled(kvm))
> > +             kvm_tdp_mmu_invalidate_roots(kvm);
> > +
> >       /*
> >        * Toggle mmu_valid_gen between '0' and '1'.  Because slots_lock is
> >        * held for the entire duration of zapping obsolete pages, it's
> > @@ -5451,9 +5454,6 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
> >
> >       kvm_zap_obsolete_pages(kvm);
> >
> > -     if (is_tdp_mmu_enabled(kvm))
> > -             kvm_tdp_mmu_zap_all(kvm);
> > -
>
> This is just cosmetic, but I'd prefer to keep the call to
> kvm_tdp_mmu_invalidate_roots at the original place, so that it's clear
> in the next patch that it's two separate parts because of the different
> locking requirements.

I'm not sure exactly what you mean and I could certainly do a better
job explaining in the commit description, but it's actually quite
important that kvm_tdp_mmu_invalidate_roots at least precede
kvm_zap_obsolete_pages as kvm_zap_obsolete_pages drops the lock and
yields. If kvm_tdp_mmu_invalidate_roots doesn't go first then vCPUs
could wind up dropping their ref on an old root and then picking it up
again before the last root had a chance to drop its ref.

Explaining in the description that kvm_tdp_mmu_zap_all is being
dropped because it is no longer necessary (as opposed to being moved)
might help make that cleaner.

Alternatively I could just leave kvm_tdp_mmu_zap_all and replace it in
the next patch.

>
> Paolo
>
Ben Gardon April 1, 2021, 4:50 p.m. UTC | #4
On Wed, Mar 31, 2021 at 3:27 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Mar 31, 2021, Ben Gardon wrote:
> > Provide a real mechanism for fast invalidation by marking roots as
> > invalid so that their reference count will quickly fall to zero
> > and they will be torn down.
> >
> > One negative side affect of this approach is that a vCPU thread will
> > likely drop the last reference to a root and be saddled with the work of
> > tearing down an entire paging structure. This issue will be resolved in
> > a later commit.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
>
> ...
>
> > +/*
> > + * This function depends on running in the same MMU lock cirical section as
> > + * kvm_reload_remote_mmus. Since this is in the same critical section, no new
> > + * roots will be created between this function and the MMU reload signals
> > + * being sent.
>
> Eww.  That goes beyond just adding a lockdep assertion here.  I know you want to
> isolate the TDP MMU as much as possible, but this really feels like it should be
> open coded in kvm_mmu_zap_all_fast().  And assuming this lands after as_id is
> added to for_each_tdp_mmu_root(), it's probably easier to open code anyways, e.g.
> use list_for_each_entry() directly instead of bouncing through an iterator.

Yeah, that's fair. I'll open-code it here. I agree that it will remove
confusion from the function, though it would be nice to be able to use
for_each_tdp_mmu_root for the lockdep and rcu annotations.


>
> > + */
> > +void kvm_tdp_mmu_invalidate_roots(struct kvm *kvm)
> > +{
> > +     struct kvm_mmu_page *root;
> > +
> > +     for_each_tdp_mmu_root(kvm, root)
> > +             root->role.invalid = true;
> > +}
Paolo Bonzini April 1, 2021, 5:02 p.m. UTC | #5
On 01/04/21 18:50, Ben Gardon wrote:
>> This is just cosmetic, but I'd prefer to keep the call to
>> kvm_tdp_mmu_invalidate_roots at the original place, so that it's clear
>> in the next patch that it's two separate parts because of the different
>> locking requirements.
> I'm not sure exactly what you mean and I could certainly do a better
> job explaining in the commit description, but it's actually quite
> important that kvm_tdp_mmu_invalidate_roots at least precede
> kvm_zap_obsolete_pages as kvm_zap_obsolete_pages drops the lock and
> yields. If kvm_tdp_mmu_invalidate_roots doesn't go first then vCPUs
> could wind up dropping their ref on an old root and then picking it up
> again before the last root had a chance to drop its ref.
> Explaining in the description that kvm_tdp_mmu_zap_all is being
> dropped because it is no longer necessary (as opposed to being moved)
> might help make that cleaner.

No, what would help is the remark you just made about 
kvm_zap_obsolete_pages yielding.  But that doesn't matter after 13/13 
though, does it?  Perhaps it's easier to just combine them.

Paolo

Patch
diff mbox series

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bf535c9f7ff2..49b7097fb55b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5430,6 +5430,9 @@  static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	write_lock(&kvm->mmu_lock);
 	trace_kvm_mmu_zap_all_fast(kvm);
 
+	if (is_tdp_mmu_enabled(kvm))
+		kvm_tdp_mmu_invalidate_roots(kvm);
+
 	/*
 	 * Toggle mmu_valid_gen between '0' and '1'.  Because slots_lock is
 	 * held for the entire duration of zapping obsolete pages, it's
@@ -5451,9 +5454,6 @@  static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 
 	kvm_zap_obsolete_pages(kvm);
 
-	if (is_tdp_mmu_enabled(kvm))
-		kvm_tdp_mmu_zap_all(kvm);
-
 	write_unlock(&kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 0c90dc034819..428ff6778426 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -789,6 +789,20 @@  void kvm_tdp_mmu_zap_all(struct kvm *kvm)
 		kvm_flush_remote_tlbs(kvm);
 }
 
+/*
+ * This function depends on running in the same MMU lock cirical section as
+ * kvm_reload_remote_mmus. Since this is in the same critical section, no new
+ * roots will be created between this function and the MMU reload signals
+ * being sent.
+ */
+void kvm_tdp_mmu_invalidate_roots(struct kvm *kvm)
+{
+	struct kvm_mmu_page *root;
+
+	for_each_tdp_mmu_root(kvm, root)
+		root->role.invalid = true;
+}
+
 /*
  * Installs a last-level SPTE to handle a TDP page fault.
  * (NPT/EPT violation/misconfiguration)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 855e58856815..ff4978817fb8 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -10,6 +10,9 @@  hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm *kvm,
 						     struct kvm_mmu_page *root)
 {
+	if (root->role.invalid)
+		return false;
+
 	return refcount_inc_not_zero(&root->tdp_mmu_root_count);
 }
 
@@ -20,6 +23,8 @@  bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end,
 			       bool shared);
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 
+void kvm_tdp_mmu_invalidate_roots(struct kvm *kvm);
+
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 		    int map_writable, int max_level, kvm_pfn_t pfn,
 		    bool prefault);