From: Peter Feiner <pfeiner@google.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jon Cargille <jcargill@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, kvm list <kvm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] kvm: x86 mmu: avoid mmu_page_hash lookup for direct_map-only VM
Date: Tue, 12 May 2020 15:36:21 -0700	[thread overview]
Message-ID: <CAM3pwhEw+KYq9AD+z8wPGyG10Bex7xLKaPM=yVV-H+W_eHTW4w@mail.gmail.com> (raw)
In-Reply-To: <20200508201355.GS27052@linux.intel.com>

On Fri, May 8, 2020 at 1:14 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Fri, May 08, 2020 at 11:24:25AM -0700, Jon Cargille wrote:
> > From: Peter Feiner <pfeiner@google.com>
> >
> > Optimization for avoiding lookups in mmu_page_hash. When there's a
> > single direct root, a shadow page has at most one parent SPTE
> > (non-root SPs have exactly one; the root has none). Thus, if an SPTE
> > is non-present, it can be linked to a newly allocated SP without
> > first checking if the SP already exists.
>
> Some mechanical comments below.  I'll think through the actual logic next
> week, my brain needs to be primed anytime the MMU is involved :-)
>
> > This optimization has proven significant in batch large SP shattering
> > where the hash lookup accounted for 95% of the overhead.
> >
> > Signed-off-by: Peter Feiner <pfeiner@google.com>
> > Signed-off-by: Jon Cargille <jcargill@google.com>
> > Reviewed-by: Jim Mattson <jmattson@google.com>
> >
> > ---
> >  arch/x86/include/asm/kvm_host.h | 13 ++++++++
> >  arch/x86/kvm/mmu/mmu.c          | 55 +++++++++++++++++++--------------
> >  2 files changed, 45 insertions(+), 23 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index a239a297be33..9b70d764b626 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -913,6 +913,19 @@ struct kvm_arch {
> >       struct kvm_page_track_notifier_node mmu_sp_tracker;
> >       struct kvm_page_track_notifier_head track_notifier_head;
> >
> > +     /*
> > +      * Optimization for avoiding lookups in mmu_page_hash. When there's a
> > +      * single direct root, a shadow page has at most one parent SPTE
> > +      * (non-root SPs have exactly one; the root has none). Thus, if an SPTE
> > +      * is non-present, it can be linked to a newly allocated SP without
> > +      * first checking if the SP already exists.
> > +      *
> > +      * False initially because there are no indirect roots.
> > +      *
> > +      * Guarded by mmu_lock.
> > +      */
> > +     bool shadow_page_may_have_multiple_parents;
>
> Why make this a one-way bool?  Wouldn't it be better to let this transition
> back to '0' once all nested guests go away?

I made it one way because I didn't know how the shadow MMU worked in
2015 :-) I was concerned about not quite getting the transition back
to '0' at the right point. I.e., what's the necessary set of
conditions where we never have to look for a parent SP? Is it just
when there are no indirect roots? Or could we be building some
internal part of the tree despite there not being roots? TBH, now that
it's been 12 months since I last thought _hard_ about the KVM MMU,
it'd take some time for me to review these questions.

>
> And maybe a shorter name that reflects what it tracks instead of how it's
> used, e.g. has_indirect_mmu or indirect_mmu_count.

Good idea.

>
> > +
> >       struct list_head assigned_dev_head;
> >       struct iommu_domain *iommu_domain;
> >       bool iommu_noncoherent;
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e618472c572b..d94552b0ed77 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2499,35 +2499,40 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >               quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
> >               role.quadrant = quadrant;
> >       }
> > -     for_each_valid_sp(vcpu->kvm, sp, gfn) {
> > -             if (sp->gfn != gfn) {
> > -                     collisions++;
> > -                     continue;
> > -             }
> >
> > -             if (!need_sync && sp->unsync)
> > -                     need_sync = true;
> > +     if (vcpu->kvm->arch.shadow_page_may_have_multiple_parents ||
> > +         level == vcpu->arch.mmu->root_level) {
>
> Might be worth a goto to preserve the for-loop.

Or factor out the guts of the loop into a function.

>
> > +             for_each_valid_sp(vcpu->kvm, sp, gfn) {
> > +                     if (sp->gfn != gfn) {
> > +                             collisions++;
> > +                             continue;
> > +                     }
> >
> > -             if (sp->role.word != role.word)
> > -                     continue;
> > +                     if (!need_sync && sp->unsync)
> > +                             need_sync = true;
> >
> > -             if (sp->unsync) {
> > -                     /* The page is good, but __kvm_sync_page might still end
> > -                      * up zapping it.  If so, break in order to rebuild it.
> > -                      */
> > -                     if (!__kvm_sync_page(vcpu, sp, &invalid_list))
> > -                             break;
> > +                     if (sp->role.word != role.word)
> > +                             continue;
> >
> > -                     WARN_ON(!list_empty(&invalid_list));
> > -                     kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > -             }
> > +                     if (sp->unsync) {
> > +                             /* The page is good, but __kvm_sync_page might
> > +                              * still end up zapping it.  If so, break in
> > +                              * order to rebuild it.
> > +                              */
> > +                             if (!__kvm_sync_page(vcpu, sp, &invalid_list))
> > +                                     break;
> >
> > -             if (sp->unsync_children)
> > -                     kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > +                             WARN_ON(!list_empty(&invalid_list));
> > +                             kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > +                     }
> >
> > -             __clear_sp_write_flooding_count(sp);
> > -             trace_kvm_mmu_get_page(sp, false);
> > -             goto out;
> > +                     if (sp->unsync_children)
> > +                             kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > +
> > +                     __clear_sp_write_flooding_count(sp);
> > +                     trace_kvm_mmu_get_page(sp, false);
> > +                     goto out;
> > +             }
> >       }
> >
> >       ++vcpu->kvm->stat.mmu_cache_miss;
> > @@ -3735,6 +3740,10 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
> >       gfn_t root_gfn, root_pgd;
> >       int i;
> >
> > +     spin_lock(&vcpu->kvm->mmu_lock);
> > +     vcpu->kvm->arch.shadow_page_may_have_multiple_parents = true;
> > +     spin_unlock(&vcpu->kvm->mmu_lock);
>
> Taking the lock every time is unnecessary, even if this is changed to a
> refcount type variable, e.g.
>
>         if (!has_indirect_mmu) {
>                 lock_and_set
>         }
>
> or
>
>         if (atomic_inc_return(&indirect_mmu_count) == 1)
>                 lock_and_unlock;
>
>

Indeed. Good suggestion.

> > +
> >       root_pgd = vcpu->arch.mmu->get_guest_pgd(vcpu);
> >       root_gfn = root_pgd >> PAGE_SHIFT;
> >
> > --
> > 2.26.2.303.gf8c07b1a785-goog
> >

Thread overview: 5+ messages
2020-05-08 18:24 [PATCH] kvm: x86 mmu: avoid mmu_page_hash lookup for direct_map-only VM Jon Cargille
2020-05-08 20:13 ` Sean Christopherson
2020-05-12 22:36   ` Peter Feiner [this message]
2020-06-23  6:53     ` Sean Christopherson
2020-06-23 15:59       ` Sean Christopherson
