From: David Matlack <dmatlack@google.com>
To: Ben Gardon <bgardon@google.com>
Cc: kvm list <kvm@vger.kernel.org>, Joerg Roedel <joro@8bytes.org>,
	Jim Mattson <jmattson@google.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Junaid Shahid <junaids@google.com>,
	Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH 6/8] KVM: x86/mmu: fast_page_fault support for the TDP MMU
Date: Mon, 14 Jun 2021 22:34:42 +0000
Message-ID: <YMfZgufRGaW//gKE@google.com>
In-Reply-To: <CANgfPd97iUUduRi-faL4i3fMov7f67iU_LheeJzYoW=QnaHXLg@mail.gmail.com>

On Mon, Jun 14, 2021 at 10:56:47AM -0700, Ben Gardon wrote:
> On Fri, Jun 11, 2021 at 4:59 PM David Matlack <dmatlack@google.com> wrote:
> >
> > On Fri, Jun 11, 2021 at 4:57 PM David Matlack <dmatlack@google.com> wrote:
> > >
> > > This commit enables the fast_page_fault handler to work when the TDP MMU
> > > is enabled by leveraging the new walk_shadow_page_lockless* API to
> > > collect page walks independently of which MMU is in use.
> > >
> > > fast_page_fault was already using
> > > walk_shadow_page_lockless_{begin,end}(); we just have to change the
> > > actual walk to use walk_shadow_page_lockless(), which does the right
> > > thing if the TDP MMU is in use.
> > >
> > > Signed-off-by: David Matlack <dmatlack@google.com>
> >
> > Adding this feature was suggested by Ben Gardon:
> >
> > Suggested-by: Ben Gardon <bgardon@google.com>
> 
> Reviewed-by: Ben Gardon <bgardon@google.com>
> 
> >
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 52 +++++++++++++++++-------------------------
> > >  1 file changed, 21 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 765f5b01768d..5562727c3699 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -657,6 +657,9 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
> > >         local_irq_enable();
> > >  }
> > >
> > > +static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
> > > +                                     struct shadow_page_walk *walk);
> > > +
> > >  static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
> > >  {
> > >         int r;
> > > @@ -2967,14 +2970,9 @@ static bool page_fault_can_be_fast(u32 error_code)
> > >   * Returns true if the SPTE was fixed successfully. Otherwise,
> > >   * someone else modified the SPTE from its original value.
> > >   */
> > > -static bool
> > > -fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> > > -                       u64 *sptep, u64 old_spte, u64 new_spte)
> > > +static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, gpa_t gpa,
> > > +                                   u64 *sptep, u64 old_spte, u64 new_spte)
> > >  {
> > > -       gfn_t gfn;
> > > -
> > > -       WARN_ON(!sp->role.direct);
> > > -
> > >         /*
> > >          * Theoretically we could also set dirty bit (and flush TLB) here in
> > >          * order to eliminate unnecessary PML logging. See comments in
> > > @@ -2990,14 +2988,8 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> > >         if (cmpxchg64(sptep, old_spte, new_spte) != old_spte)
> > >                 return false;
> > >
> > > -       if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
> > > -               /*
> > > -                * The gfn of direct spte is stable since it is
> > > -                * calculated by sp->gfn.
> > > -                */
> > > -               gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> > > -               kvm_vcpu_mark_page_dirty(vcpu, gfn);
> > > -       }
> > > +       if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
> > > +               kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
> 
> I love how cleanly you've implemented TDP MMU fast PF support here
> with so little code duplication. I think this approach will work well.
> 
> My only reservation is that this couples the way we locklessly change
> (and handle changes to) SPTEs in the TDP and legacy MMUs.
> 
> fast_pf_fix_direct_spte is certainly a much simpler function than
> tdp_mmu_set_spte_atomic, but it might be worth adding a comment
> explaining that the function can modify SPTEs in a TDP MMU paging
> structure. Alternatively, fast_page_fault could call
> tdp_mmu_set_spte_atomic, but I don't know if that would carry a
> performance penalty. On the other hand, the handling for this
> particular type of SPTE modification is unlikely to change, so it
> might not matter.

I'm open to delegating the cmpxchg64 to separate functions for the TDP
and legacy MMUs, but I didn't see any reason why that would be necessary.
That, combined with the fact that it would require a bunch of new code
and helper functions in the TDP MMU (e.g. tdp_mmu_set_spte_atomic would
have to be split up further so it doesn't do the lockdep check), led me
to just re-use fast_pf_fix_direct_spte.
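
For illustration, delegating would mean splitting out something like this
hypothetical helper on the TDP MMU side, i.e. the cmpxchg portion of
tdp_mmu_set_spte_atomic without the lockdep assertion (which would fire
in the fast page fault path, since we don't hold mmu_lock there). The
names here are made up, it's just a sketch:

  /*
   * Hypothetical sketch: the lockless cmpxchg portion of
   * tdp_mmu_set_spte_atomic, split out so it can be called without
   * holding mmu_lock for read.
   */
  static bool tdp_mmu_cmpxchg_spte_lockless(u64 *sptep, u64 old_spte,
  					    u64 new_spte)
  {
  	/* Fails (returns false) if someone else changed the SPTE first. */
  	return cmpxchg64(sptep, old_spte, new_spte) == old_spte;
  }

And a wrapper that thin doesn't buy us much over re-using
fast_pf_fix_direct_spte directly.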

In v2, though, I can add a comment to fast_pf_fix_direct_spte and
another to tdp_mmu_set_spte_atomic explaining how the two interact.
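
For the fast_pf_fix_direct_spte side I'm thinking of something roughly
like this (wording not final):

  /*
   * Note: fast_page_fault() can call this function on SPTEs in either
   * the legacy MMU's or the TDP MMU's paging structures, since the
   * lockless walk is MMU-agnostic. Any change to how SPTEs are updated
   * locklessly here must stay compatible with tdp_mmu_set_spte_atomic().
   */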

> 
> > >
> > >         return true;
> > >  }
> > > @@ -3019,10 +3011,9 @@ static bool is_access_allowed(u32 fault_err_code, u64 spte)
> > >   */
> > >  static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
> > >  {
> > > -       struct kvm_shadow_walk_iterator iterator;
> > > -       struct kvm_mmu_page *sp;
> > >         int ret = RET_PF_INVALID;
> > >         u64 spte = 0ull;
> > > +       u64 *sptep = NULL;
> > >         uint retry_count = 0;
> > >
> > >         if (!page_fault_can_be_fast(error_code))
> > > @@ -3031,17 +3022,19 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
> > >         walk_shadow_page_lockless_begin(vcpu);
> > >
> > >         do {
> > > +               struct shadow_page_walk walk;
> > >                 u64 new_spte;
> > >
> > > -               for_each_shadow_entry_lockless(vcpu, gpa, iterator, spte)
> > > -                       if (!is_shadow_present_pte(spte))
> > > -                               break;
> > > +               if (!walk_shadow_page_lockless(vcpu, gpa, &walk))
> > > +                       break;
> > > +
> > > +               spte = walk.sptes[walk.last_level];
> > > +               sptep = walk.spteps[walk.last_level];
> > >
> > >                 if (!is_shadow_present_pte(spte))
> > >                         break;
> > >
> > > -               sp = sptep_to_sp(iterator.sptep);
> > > -               if (!is_last_spte(spte, sp->role.level))
> > > +               if (!is_last_spte(spte, walk.last_level))
> > >                         break;
> > >
> > >                 /*
> > > @@ -3084,7 +3077,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
> > >                          *
> > >                          * See the comments in kvm_arch_commit_memory_region().
> > >                          */
> > > -                       if (sp->role.level > PG_LEVEL_4K)
> > > +                       if (walk.last_level > PG_LEVEL_4K)
> > >                                 break;
> > >                 }
> > >
> > > @@ -3098,8 +3091,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
> > >                  * since the gfn is not stable for indirect shadow page. See
> > >                  * Documentation/virt/kvm/locking.rst to get more detail.
> > >                  */
> > > -               if (fast_pf_fix_direct_spte(vcpu, sp, iterator.sptep, spte,
> > > -                                           new_spte)) {
> > > +               if (fast_pf_fix_direct_spte(vcpu, gpa, sptep, spte, new_spte)) {
> > >                         ret = RET_PF_FIXED;
> > >                         break;
> > >                 }
> > > @@ -3112,7 +3104,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
> > >
> > >         } while (true);
> > >
> > > -       trace_fast_page_fault(vcpu, gpa, error_code, iterator.sptep, spte, ret);
> > > +       trace_fast_page_fault(vcpu, gpa, error_code, sptep, spte, ret);
> > >         walk_shadow_page_lockless_end(vcpu);
> > >
> > >         return ret;
> > > @@ -3748,11 +3740,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> > >         if (page_fault_handle_page_track(vcpu, error_code, gfn))
> > >                 return RET_PF_EMULATE;
> > >
> > > -       if (!is_vcpu_using_tdp_mmu(vcpu)) {
> > > -               r = fast_page_fault(vcpu, gpa, error_code);
> > > -               if (r != RET_PF_INVALID)
> > > -                       return r;
> > > -       }
> > > +       r = fast_page_fault(vcpu, gpa, error_code);
> > > +       if (r != RET_PF_INVALID)
> > > +               return r;
> > >
> > >         r = mmu_topup_memory_caches(vcpu, false);
> > >         if (r)
> > > --
> > > 2.32.0.272.g935e593368-goog
> > >
