kvm.vger.kernel.org archive mirror
From: Ben Gardon <bgardon@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Hou Wenlong <houwenlong93@linux.alibaba.com>
Subject: Re: [PATCH 24/28] KVM: x86/mmu: Add dedicated helper to zap TDP MMU root shadow page
Date: Mon, 22 Nov 2021 17:04:10 -0800
Message-ID: <CANgfPd97nEn8WYWEnXPbpJanP=DQ4yh1E3z+x9T5kLX=8ge+WQ@mail.gmail.com>
In-Reply-To: <20211120045046.3940942-25-seanjc@google.com>

On Fri, Nov 19, 2021 at 8:51 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Convert tdp_mmu_zap_root() into its own dedicated flow instead of simply
> redirecting into zap_gfn_range().  In addition to hardening zapping of
> roots, this will allow future simplification of zap_gfn_range() by having
> it zap only leaf SPTEs, and by removing its tricky "zap all" heuristic.
> By having all paths that truly need to free _all_ SPs flow through the
> dedicated root zapper, the generic zapper can be freed of those concerns.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 91 +++++++++++++++++++++++++++-----------
>  1 file changed, 66 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 99ea19e763da..0e5a0d40e54a 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -53,10 +53,6 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
>         rcu_barrier();
>  }
>
> -static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
> -                         gfn_t start, gfn_t end, bool can_yield, bool flush,
> -                         bool shared);
> -
>  static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
>  {
>         free_page((unsigned long)sp->spt);
> @@ -79,11 +75,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
>         tdp_mmu_free_sp(sp);
>  }
>
> -static bool tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
> -                            bool shared)
> -{
> -       return zap_gfn_range(kvm, root, 0, -1ull, true, false, shared);
> -}
> +static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
> +                            bool shared, bool root_is_unreachable);
>
>  /*
>   * Note, putting a root might sleep, i.e. the caller must have IRQs enabled and
> @@ -120,13 +113,8 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
>          * invalidates any paging-structure-cache entries, i.e. TLB entries for
>          * intermediate paging structures, that may be zapped, as such entries
>          * are associated with the ASID on both VMX and SVM.
> -        *
> -        * WARN if a flush is reported for an invalid root, as its child SPTEs
> -        * should have been zapped by kvm_tdp_mmu_zap_invalidated_roots(), and
> -        * inserting new SPTEs under an invalid root is a KVM bug.
>          */
> -       if (tdp_mmu_zap_root(kvm, root, shared))
> -               WARN_ON_ONCE(root->role.invalid);
> +       tdp_mmu_zap_root(kvm, root, shared, true);
>
>         call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -766,6 +754,65 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
>         return false;
>  }
>
> +static inline gfn_t tdp_mmu_max_gfn_host(void)
> +{
> +       /*
> +        * Bound TDP MMU walks at host.MAXPHYADDR, guest accesses beyond that
> +        * will hit a #PF(RSVD) and never hit an EPT Violation/Misconfig / #NPF,
> +        * and so KVM will never install a SPTE for such addresses.
> +        */
> +       return 1ULL << (shadow_phys_bits - PAGE_SHIFT);
> +}
> +
> +static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
> +                            bool shared, bool root_is_unreachable)
> +{
> +       struct tdp_iter iter;
> +
> +       gfn_t end = tdp_mmu_max_gfn_host();
> +       gfn_t start = 0;
> +
> +       kvm_lockdep_assert_mmu_lock_held(kvm, shared);
> +
> +       rcu_read_lock();
> +
> +       /*
> +        * No need to try to step down in the iterator when zapping an entire
> +        * root, zapping an upper-level SPTE will recurse on its children.
> +        */
> +       for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
> +                                  root->role.level, start, end) {
> +retry:
> +               if (tdp_mmu_iter_cond_resched(kvm, &iter, false, shared))
> +                       continue;
> +
> +               if (!is_shadow_present_pte(iter.old_spte))
> +                       continue;
> +
> +               if (!shared) {
> +                       tdp_mmu_set_spte(kvm, &iter, 0);
> +               } else if (!tdp_mmu_set_spte_atomic(kvm, &iter, 0)) {

It would be worth adding a comment explaining why
tdp_mmu_set_spte_atomic() is used here instead of
tdp_mmu_zap_spte_atomic().

> +                       /*
> +                        * cmpxchg() shouldn't fail if the root is unreachable.
> +                        * to be unreachable.  Re-read the SPTE and retry so as

Repeated phrase: "to be unreachable." is left over from the previous
line and should be dropped.


> +                        * not to leak the page and its children.
> +                        */
> +                       WARN_ONCE(root_is_unreachable,
> +                                 "Contended TDP MMU SPTE in unreachable root.");
> +                       iter.old_spte = kvm_tdp_mmu_read_spte(iter.sptep);

Note this will conflict with the series David sent out Friday.
Hopefully some of the cleanups early in that series get merged, in
which case this line will not be needed.

> +                       goto retry;
> +               }
> +               /*
> +                * WARN if the root is invalid and is unreachable, all SPTEs
> +                * should've been zapped by kvm_tdp_mmu_zap_invalidated_roots(),
> +                * and inserting new SPTEs under an invalid root is a KVM bug.
> +                */
> +               WARN_ON_ONCE(root_is_unreachable && root->role.invalid);
> +       }
> +
> +       rcu_read_unlock();
> +}
> +
>  bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
>         u64 old_spte;
> @@ -807,8 +854,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
>                           gfn_t start, gfn_t end, bool can_yield, bool flush,
>                           bool shared)
>  {
> -       gfn_t max_gfn_host = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
> -       bool zap_all = (start == 0 && end >= max_gfn_host);
> +       bool zap_all = (start == 0 && end >= tdp_mmu_max_gfn_host());
>         struct tdp_iter iter;
>
>         /*
> @@ -817,12 +863,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
>          */
>         int min_level = zap_all ? root->role.level : PG_LEVEL_4K;
>
> -       /*
> -        * Bound the walk at host.MAXPHYADDR, guest accesses beyond that will
> -        * hit a #PF(RSVD) and never get to an EPT Violation/Misconfig / #NPF,
> -        * and so KVM will never install a SPTE for such addresses.
> -        */
> -       end = min(end, max_gfn_host);
> +       end = min(end, tdp_mmu_max_gfn_host());

The tdp_mmu_max_gfn_host() helper and this refactor could be split
into a separate commit if desired.
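
As a quick illustration of the arithmetic the helper wraps, here is a
standalone sketch. It assumes 4KiB pages (PAGE_SHIFT of 12) and takes
the physical address width as a parameter instead of reading
shadow_phys_bits; max_gfn_for() and clamp_end() are made-up names for
this example only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumes 4KiB pages */

typedef uint64_t gfn_t;

/* First GFN beyond an address space that is phys_bits wide. */
static gfn_t max_gfn_for(unsigned int phys_bits)
{
	/* Keep the shift count in range for a 64-bit value. */
	assert(phys_bits >= PAGE_SHIFT && phys_bits - PAGE_SHIFT < 64);

	return 1ULL << (phys_bits - PAGE_SHIFT);
}

/* Bound the end of a walk at the maximum GFN, like the min() in the patch. */
static gfn_t clamp_end(gfn_t end, unsigned int phys_bits)
{
	gfn_t max_gfn = max_gfn_for(phys_bits);

	return end < max_gfn ? end : max_gfn;
}
```

For a 52-bit physical address width this gives 1ULL << 40 as the
bound, so a caller passing -1ull as the end gets clamped to that value
while a smaller end is left alone.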

>
>         kvm_lockdep_assert_mmu_lock_held(kvm, shared);
>
> @@ -898,7 +939,7 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
>          */
>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>                 for_each_tdp_mmu_root_yield_safe(kvm, root, i, false)
> -                       (void)tdp_mmu_zap_root(kvm, root, false);
> +                       tdp_mmu_zap_root(kvm, root, false, true);
>         }
>  }
>
> @@ -934,7 +975,7 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm,
>                  * will still flush on yield, but that's a minor performance
>                  * blip and not a functional issue.
>                  */
> -               (void)tdp_mmu_zap_root(kvm, root, true);
> +               tdp_mmu_zap_root(kvm, root, true, false);
>                 kvm_tdp_mmu_put_root(kvm, root, true);
>         }
>  }
> --
> 2.34.0.rc2.393.gf8c9666880-goog
>
