* Re: [PATCH 4/8] KVM: x86/mmu: Common API for lockless shadow page walks
@ 2021-06-16 23:15 kernel test robot
  0 siblings, 0 replies; 3+ messages in thread
From: kernel test robot @ 2021-06-16 23:15 UTC (permalink / raw)
  To: kbuild


CC: kbuild-all@lists.01.org
In-Reply-To: <20210611235701.3941724-5-dmatlack@google.com>
References: <20210611235701.3941724-5-dmatlack@google.com>
TO: David Matlack <dmatlack@google.com>
TO: kvm@vger.kernel.org
CC: Ben Gardon <bgardon@google.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: Jim Mattson <jmattson@google.com>
CC: Wanpeng Li <wanpengli@tencent.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Sean Christopherson <seanjc@google.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Junaid Shahid <junaids@google.com>
CC: Andrew Jones <drjones@redhat.com>

Hi David,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.13-rc6 next-20210616]
[cannot apply to kvm/queue vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/David-Matlack/KVM-x86-mmu-Fast-page-fault-support-for-the-TDP-MMU/20210617-013501
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6b00bc639f1f2beeff3595e1bab9faaa51d23b01
:::::: branch date: 6 hours ago
:::::: commit date: 6 hours ago
config: x86_64-randconfig-s021-20210615 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/0050bf7241bac2324a20ff442b58158d18b08c13
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review David-Matlack/KVM-x86-mmu-Fast-page-fault-support-for-the-TDP-MMU/20210617-013501
        git checkout 0050bf7241bac2324a20ff442b58158d18b08c13
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   arch/x86/kvm/mmu/tdp_mmu.c:275:9: sparse: sparse: context imbalance in 'tdp_mmu_link_page' - different lock contexts for basic block
   arch/x86/kvm/mmu/tdp_mmu.c:300:9: sparse: sparse: context imbalance in 'tdp_mmu_unlink_page' - different lock contexts for basic block
>> arch/x86/kvm/mmu/tdp_mmu.c:1507:6: sparse: sparse: context imbalance in 'kvm_tdp_mmu_walk_lockless_begin' - wrong count at exit
   arch/x86/kvm/mmu/tdp_mmu.c:1512:6: sparse: sparse: context imbalance in 'kvm_tdp_mmu_walk_lockless_end' - unexpected unlock

vim +/kvm_tdp_mmu_walk_lockless_begin +1507 arch/x86/kvm/mmu/tdp_mmu.c

46044f72c3826b Ben Gardon    2020-10-14  1506  
0050bf7241bac2 David Matlack 2021-06-11 @1507  void kvm_tdp_mmu_walk_lockless_begin(void)
0050bf7241bac2 David Matlack 2021-06-11  1508  {
0050bf7241bac2 David Matlack 2021-06-11  1509  	rcu_read_lock();
0050bf7241bac2 David Matlack 2021-06-11  1510  }
0050bf7241bac2 David Matlack 2021-06-11  1511  
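
The "wrong count at exit" and "unexpected unlock" reports are expected for a
pair of helpers that deliberately opens an RCU read-side critical section in
one function and closes it in another; sparse's context tracking only sees the
imbalance within each function. A minimal sketch of how the helpers could be
annotated to convey this (using the standard __acquires()/__releases() markers;
not part of the posted patch):

        /* Sketch only: tell sparse the RCU section is split across the pair. */
        void kvm_tdp_mmu_walk_lockless_begin(void) __acquires(RCU)
        {
                rcu_read_lock();
        }

        void kvm_tdp_mmu_walk_lockless_end(void) __releases(RCU)
        {
                rcu_read_unlock();
        }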

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 37588 bytes --]


* Re: [PATCH 4/8] KVM: x86/mmu: Common API for lockless shadow page walks
  2021-06-11 23:56 ` [PATCH 4/8] KVM: x86/mmu: Common API for lockless shadow page walks David Matlack
@ 2021-06-14 17:56   ` Ben Gardon
  0 siblings, 0 replies; 3+ messages in thread
From: Ben Gardon @ 2021-06-14 17:56 UTC (permalink / raw)
  To: David Matlack
  Cc: kvm, Joerg Roedel, Jim Mattson, Wanpeng Li, Vitaly Kuznetsov,
	Sean Christopherson, Paolo Bonzini, Junaid Shahid, Andrew Jones

On Fri, Jun 11, 2021 at 4:57 PM David Matlack <dmatlack@google.com> wrote:
>
> Introduce a common API for walking the shadow page tables locklessly
> that abstracts away whether the TDP MMU is enabled or not. This will be
> used in a follow-up patch to support the TDP MMU in fast_page_fault.
>
> The API can be used as follows:
>
>   struct shadow_page_walk walk;
>
>   walk_shadow_page_lockless_begin(vcpu);
>   if (!walk_shadow_page_lockless(vcpu, addr, &walk))
>     goto out;
>
>   ... use `walk` ...
>
> out:
>   walk_shadow_page_lockless_end(vcpu);
>
> Note: Separating walk_shadow_page_lockless_begin() from
> walk_shadow_page_lockless() seems superfluous at first glance but is
> needed to support fast_page_fault() since it performs multiple walks
> under the same begin/end block.
>
> No functional change intended.
>
> Signed-off-by: David Matlack <dmatlack@google.com>

Reviewed-by: Ben Gardon <bgardon@google.com>


> ---
>  arch/x86/kvm/mmu/mmu.c          | 96 ++++++++++++++++++++-------------
>  arch/x86/kvm/mmu/mmu_internal.h | 15 ++++++
>  arch/x86/kvm/mmu/tdp_mmu.c      | 34 ++++++------
>  arch/x86/kvm/mmu/tdp_mmu.h      |  6 ++-
>  4 files changed, 96 insertions(+), 55 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 1d0fe1445e04..8140c262f4d3 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -623,6 +623,11 @@ static bool mmu_spte_age(u64 *sptep)
>
>  static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
>  {
> +       if (is_vcpu_using_tdp_mmu(vcpu)) {
> +               kvm_tdp_mmu_walk_lockless_begin();
> +               return;
> +       }
> +
>         /*
>          * Prevent page table teardown by making any free-er wait during
>          * kvm_flush_remote_tlbs() IPI to all active vcpus.
> @@ -638,6 +643,11 @@ static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
>
>  static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
>  {
> +       if (is_vcpu_using_tdp_mmu(vcpu)) {
> +               kvm_tdp_mmu_walk_lockless_end();
> +               return;
> +       }
> +
>         /*
>          * Make sure the write to vcpu->mode is not reordered in front of
>          * reads to sptes.  If it does, kvm_mmu_commit_zap_page() can see us
> @@ -3501,59 +3511,61 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  }
>
>  /*
> - * Return the level of the lowest level SPTE added to sptes.
> - * That SPTE may be non-present.
> + * Walks the shadow page table for the given address until a leaf or non-present
> + * spte is encountered.
> + *
> + * Returns false if no walk could be performed, in which case `walk` does not
> + * contain any valid data.
> + *
> + * Must be called between walk_shadow_page_lockless_{begin,end}.
>   */
> -static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
> +static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
> +                                     struct shadow_page_walk *walk)
>  {
> -       struct kvm_shadow_walk_iterator iterator;
> -       int leaf = -1;
> +       struct kvm_shadow_walk_iterator it;
> +       bool walk_ok = false;
>         u64 spte;
>
> -       walk_shadow_page_lockless_begin(vcpu);
> +       if (is_vcpu_using_tdp_mmu(vcpu))
> +               return kvm_tdp_mmu_walk_lockless(vcpu, addr, walk);
>
> -       for (shadow_walk_init(&iterator, vcpu, addr),
> -            *root_level = iterator.level;
> -            shadow_walk_okay(&iterator);
> -            __shadow_walk_next(&iterator, spte)) {
> -               leaf = iterator.level;
> -               spte = mmu_spte_get_lockless(iterator.sptep);
> +       shadow_walk_init(&it, vcpu, addr);
> +       walk->root_level = it.level;
>
> -               sptes[leaf] = spte;
> +       for (; shadow_walk_okay(&it); __shadow_walk_next(&it, spte)) {
> +               walk_ok = true;
> +
> +               spte = mmu_spte_get_lockless(it.sptep);
> +               walk->last_level = it.level;
> +               walk->sptes[it.level] = spte;
>
>                 if (!is_shadow_present_pte(spte))
>                         break;
>         }
>
> -       walk_shadow_page_lockless_end(vcpu);
> -
> -       return leaf;
> +       return walk_ok;
>  }
>
>  /* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
>  static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  {
> -       u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
> +       struct shadow_page_walk walk;
>         struct rsvd_bits_validate *rsvd_check;
> -       int root, leaf, level;
> +       int last_level, level;
>         bool reserved = false;
>
> -       if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
> -               *sptep = 0ull;
> +       *sptep = 0ull;
> +
> +       if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
>                 return reserved;
> -       }
>
> -       if (is_vcpu_using_tdp_mmu(vcpu))
> -               leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
> -       else
> -               leaf = get_walk(vcpu, addr, sptes, &root);
> +       walk_shadow_page_lockless_begin(vcpu);
>
> -       if (unlikely(leaf < 0)) {
> -               *sptep = 0ull;
> -               return reserved;
> -       }
> +       if (!walk_shadow_page_lockless(vcpu, addr, &walk))
> +               goto out;
>
> -       *sptep = sptes[leaf];
> +       last_level = walk.last_level;
> +       *sptep = walk.sptes[last_level];
>
>         /*
>          * Skip reserved bits checks on the terminal leaf if it's not a valid
> @@ -3561,29 +3573,37 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>          * design, always have reserved bits set.  The purpose of the checks is
>          * to detect reserved bits on non-MMIO SPTEs. i.e. buggy SPTEs.
>          */
> -       if (!is_shadow_present_pte(sptes[leaf]))
> -               leaf++;
> +       if (!is_shadow_present_pte(walk.sptes[last_level]))
> +               last_level++;
>
>         rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
>
> -       for (level = root; level >= leaf; level--)
> +       for (level = walk.root_level; level >= last_level; level--) {
> +               u64 spte = walk.sptes[level];
> +
>                 /*
>                  * Use a bitwise-OR instead of a logical-OR to aggregate the
>                  * reserved bit and EPT's invalid memtype/XWR checks to avoid
>                  * adding a Jcc in the loop.
>                  */
> -               reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
> -                           __is_rsvd_bits_set(rsvd_check, sptes[level], level);
> +               reserved |= __is_bad_mt_xwr(rsvd_check, spte) |
> +                           __is_rsvd_bits_set(rsvd_check, spte, level);
> +       }
>
>         if (reserved) {
>                 pr_err("%s: reserved bits set on MMU-present spte, addr 0x%llx, hierarchy:\n",
>                        __func__, addr);
> -               for (level = root; level >= leaf; level--)
> +               for (level = walk.root_level; level >= last_level; level--) {
> +                       u64 spte = walk.sptes[level];
> +
>                         pr_err("------ spte = 0x%llx level = %d, rsvd bits = 0x%llx",
> -                              sptes[level], level,
> -                              rsvd_check->rsvd_bits_mask[(sptes[level] >> 7) & 1][level-1]);
> +                              spte, level,
> +                              rsvd_check->rsvd_bits_mask[(spte >> 7) & 1][level-1]);
> +               }
>         }
>
> +out:
> +       walk_shadow_page_lockless_end(vcpu);
>         return reserved;
>  }
>
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index d64ccb417c60..26da6ca30fbf 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -165,4 +165,19 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
>  void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
>  void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
>
> +struct shadow_page_walk {
> +       /* The level of the root spte in the walk. */
> +       int root_level;
> +
> +       /*
> +        * The level of the last spte in the walk. The last spte is either the
> +        * leaf of the walk (which may or may not be present) or the first
> +        * non-present spte encountered during the walk.
> +        */
> +       int last_level;
> +
> +       /* The spte value at each level. */
> +       u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
> +};
> +
>  #endif /* __KVM_X86_MMU_INTERNAL_H */
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index f4cc79dabeae..36f4844a5f95 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1504,28 +1504,32 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
>         return spte_set;
>  }
>
> -/*
> - * Return the level of the lowest level SPTE added to sptes.
> - * That SPTE may be non-present.
> - */
> -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> -                        int *root_level)
> +void kvm_tdp_mmu_walk_lockless_begin(void)
> +{
> +       rcu_read_lock();
> +}
> +
> +void kvm_tdp_mmu_walk_lockless_end(void)
> +{
> +       rcu_read_unlock();
> +}
> +
> +bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
> +                              struct shadow_page_walk *walk)
>  {
>         struct tdp_iter iter;
>         struct kvm_mmu *mmu = vcpu->arch.mmu;
>         gfn_t gfn = addr >> PAGE_SHIFT;
> -       int leaf = -1;
> +       bool walk_ok = false;
>
> -       *root_level = vcpu->arch.mmu->shadow_root_level;
> -
> -       rcu_read_lock();
> +       walk->root_level = vcpu->arch.mmu->shadow_root_level;
>
>         tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
> -               leaf = iter.level;
> -               sptes[leaf] = iter.old_spte;
> -       }
> +               walk_ok = true;
>
> -       rcu_read_unlock();
> +               walk->last_level = iter.level;
> +               walk->sptes[iter.level] = iter.old_spte;
> +       }
>
> -       return leaf;
> +       return walk_ok;
>  }
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
> index c8cf12809fcf..772d11bbb92a 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.h
> +++ b/arch/x86/kvm/mmu/tdp_mmu.h
> @@ -76,8 +76,10 @@ bool kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
>  bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
>                                    struct kvm_memory_slot *slot, gfn_t gfn);
>
> -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> -                        int *root_level);
> +void kvm_tdp_mmu_walk_lockless_begin(void);
> +void kvm_tdp_mmu_walk_lockless_end(void);
> +bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
> +                              struct shadow_page_walk *walk);
>
>  #ifdef CONFIG_X86_64
>  void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
> --
> 2.32.0.272.g935e593368-goog
>


* [PATCH 4/8] KVM: x86/mmu: Common API for lockless shadow page walks
  2021-06-11 23:56 [PATCH 0/8] KVM: x86/mmu: Fast page fault support for the TDP MMU David Matlack
@ 2021-06-11 23:56 ` David Matlack
  2021-06-14 17:56   ` Ben Gardon
  0 siblings, 1 reply; 3+ messages in thread
From: David Matlack @ 2021-06-11 23:56 UTC (permalink / raw)
  To: kvm
  Cc: Ben Gardon, Joerg Roedel, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Sean Christopherson, Paolo Bonzini,
	Junaid Shahid, Andrew Jones, David Matlack

Introduce a common API for walking the shadow page tables locklessly
that abstracts away whether the TDP MMU is enabled or not. This will be
used in a follow-up patch to support the TDP MMU in fast_page_fault.

The API can be used as follows:

  struct shadow_page_walk walk;

  walk_shadow_page_lockless_begin(vcpu);
  if (!walk_shadow_page_lockless(vcpu, addr, &walk))
    goto out;

  ... use `walk` ...

out:
  walk_shadow_page_lockless_end(vcpu);

Note: Separating walk_shadow_page_lockless_begin() from
walk_shadow_page_lockless() seems superfluous at first glance but is
needed to support fast_page_fault() since it performs multiple walks
under the same begin/end block.
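
To illustrate the point above, here is a rough sketch (not from this series;
try_fast_fix() and the retry limit are placeholders standing in for the real
fast_page_fault() logic) of the multi-walk pattern, with several lockless
walks performed under a single begin/end section:

  bool fault_fixed = false;
  int retries = 0;

  walk_shadow_page_lockless_begin(vcpu);

  do {
    struct shadow_page_walk walk;

    if (!walk_shadow_page_lockless(vcpu, addr, &walk))
      break;

    /*
     * Inspect walk.sptes[walk.last_level] and attempt to repair the
     * fault in place; try_fast_fix() stands in for that logic.
     */
    fault_fixed = try_fast_fix(vcpu, addr, &walk);
  } while (!fault_fixed && ++retries < 4);

  walk_shadow_page_lockless_end(vcpu);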

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 96 ++++++++++++++++++++-------------
 arch/x86/kvm/mmu/mmu_internal.h | 15 ++++++
 arch/x86/kvm/mmu/tdp_mmu.c      | 34 ++++++------
 arch/x86/kvm/mmu/tdp_mmu.h      |  6 ++-
 4 files changed, 96 insertions(+), 55 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1d0fe1445e04..8140c262f4d3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -623,6 +623,11 @@ static bool mmu_spte_age(u64 *sptep)
 
 static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 {
+	if (is_vcpu_using_tdp_mmu(vcpu)) {
+		kvm_tdp_mmu_walk_lockless_begin();
+		return;
+	}
+
 	/*
 	 * Prevent page table teardown by making any free-er wait during
 	 * kvm_flush_remote_tlbs() IPI to all active vcpus.
@@ -638,6 +643,11 @@ static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 
 static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 {
+	if (is_vcpu_using_tdp_mmu(vcpu)) {
+		kvm_tdp_mmu_walk_lockless_end();
+		return;
+	}
+
 	/*
 	 * Make sure the write to vcpu->mode is not reordered in front of
 	 * reads to sptes.  If it does, kvm_mmu_commit_zap_page() can see us
@@ -3501,59 +3511,61 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 }
 
 /*
- * Return the level of the lowest level SPTE added to sptes.
- * That SPTE may be non-present.
+ * Walks the shadow page table for the given address until a leaf or non-present
+ * spte is encountered.
+ *
+ * Returns false if no walk could be performed, in which case `walk` does not
+ * contain any valid data.
+ *
+ * Must be called between walk_shadow_page_lockless_{begin,end}.
  */
-static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
+static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
+				      struct shadow_page_walk *walk)
 {
-	struct kvm_shadow_walk_iterator iterator;
-	int leaf = -1;
+	struct kvm_shadow_walk_iterator it;
+	bool walk_ok = false;
 	u64 spte;
 
-	walk_shadow_page_lockless_begin(vcpu);
+	if (is_vcpu_using_tdp_mmu(vcpu))
+		return kvm_tdp_mmu_walk_lockless(vcpu, addr, walk);
 
-	for (shadow_walk_init(&iterator, vcpu, addr),
-	     *root_level = iterator.level;
-	     shadow_walk_okay(&iterator);
-	     __shadow_walk_next(&iterator, spte)) {
-		leaf = iterator.level;
-		spte = mmu_spte_get_lockless(iterator.sptep);
+	shadow_walk_init(&it, vcpu, addr);
+	walk->root_level = it.level;
 
-		sptes[leaf] = spte;
+	for (; shadow_walk_okay(&it); __shadow_walk_next(&it, spte)) {
+		walk_ok = true;
+
+		spte = mmu_spte_get_lockless(it.sptep);
+		walk->last_level = it.level;
+		walk->sptes[it.level] = spte;
 
 		if (!is_shadow_present_pte(spte))
 			break;
 	}
 
-	walk_shadow_page_lockless_end(vcpu);
-
-	return leaf;
+	return walk_ok;
 }
 
 /* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
 static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
-	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+	struct shadow_page_walk walk;
 	struct rsvd_bits_validate *rsvd_check;
-	int root, leaf, level;
+	int last_level, level;
 	bool reserved = false;
 
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
-		*sptep = 0ull;
+	*sptep = 0ull;
+
+	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
 		return reserved;
-	}
 
-	if (is_vcpu_using_tdp_mmu(vcpu))
-		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
-	else
-		leaf = get_walk(vcpu, addr, sptes, &root);
+	walk_shadow_page_lockless_begin(vcpu);
 
-	if (unlikely(leaf < 0)) {
-		*sptep = 0ull;
-		return reserved;
-	}
+	if (!walk_shadow_page_lockless(vcpu, addr, &walk))
+		goto out;
 
-	*sptep = sptes[leaf];
+	last_level = walk.last_level;
+	*sptep = walk.sptes[last_level];
 
 	/*
 	 * Skip reserved bits checks on the terminal leaf if it's not a valid
@@ -3561,29 +3573,37 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	 * design, always have reserved bits set.  The purpose of the checks is
 	 * to detect reserved bits on non-MMIO SPTEs. i.e. buggy SPTEs.
 	 */
-	if (!is_shadow_present_pte(sptes[leaf]))
-		leaf++;
+	if (!is_shadow_present_pte(walk.sptes[last_level]))
+		last_level++;
 
 	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
 
-	for (level = root; level >= leaf; level--)
+	for (level = walk.root_level; level >= last_level; level--) {
+		u64 spte = walk.sptes[level];
+
 		/*
 		 * Use a bitwise-OR instead of a logical-OR to aggregate the
 		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
 		 * adding a Jcc in the loop.
 		 */
-		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
-			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
+		reserved |= __is_bad_mt_xwr(rsvd_check, spte) |
+			    __is_rsvd_bits_set(rsvd_check, spte, level);
+	}
 
 	if (reserved) {
 		pr_err("%s: reserved bits set on MMU-present spte, addr 0x%llx, hierarchy:\n",
 		       __func__, addr);
-		for (level = root; level >= leaf; level--)
+		for (level = walk.root_level; level >= last_level; level--) {
+			u64 spte = walk.sptes[level];
+
 			pr_err("------ spte = 0x%llx level = %d, rsvd bits = 0x%llx",
-			       sptes[level], level,
-			       rsvd_check->rsvd_bits_mask[(sptes[level] >> 7) & 1][level-1]);
+			       spte, level,
+			       rsvd_check->rsvd_bits_mask[(spte >> 7) & 1][level-1]);
+		}
 	}
 
+out:
+	walk_shadow_page_lockless_end(vcpu);
 	return reserved;
 }
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index d64ccb417c60..26da6ca30fbf 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -165,4 +165,19 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
+struct shadow_page_walk {
+	/* The level of the root spte in the walk. */
+	int root_level;
+
+	/*
+	 * The level of the last spte in the walk. The last spte is either the
+	 * leaf of the walk (which may or may not be present) or the first
+	 * non-present spte encountered during the walk.
+	 */
+	int last_level;
+
+	/* The spte value at each level. */
+	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+};
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f4cc79dabeae..36f4844a5f95 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1504,28 +1504,32 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 	return spte_set;
 }
 
-/*
- * Return the level of the lowest level SPTE added to sptes.
- * That SPTE may be non-present.
- */
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level)
+void kvm_tdp_mmu_walk_lockless_begin(void)
+{
+	rcu_read_lock();
+}
+
+void kvm_tdp_mmu_walk_lockless_end(void)
+{
+	rcu_read_unlock();
+}
+
+bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
+			       struct shadow_page_walk *walk)
 {
 	struct tdp_iter iter;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	gfn_t gfn = addr >> PAGE_SHIFT;
-	int leaf = -1;
+	bool walk_ok = false;
 
-	*root_level = vcpu->arch.mmu->shadow_root_level;
-
-	rcu_read_lock();
+	walk->root_level = vcpu->arch.mmu->shadow_root_level;
 
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
-		leaf = iter.level;
-		sptes[leaf] = iter.old_spte;
-	}
+		walk_ok = true;
 
-	rcu_read_unlock();
+		walk->last_level = iter.level;
+		walk->sptes[iter.level] = iter.old_spte;
+	}
 
-	return leaf;
+	return walk_ok;
 }
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index c8cf12809fcf..772d11bbb92a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -76,8 +76,10 @@ bool kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
 bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn);
 
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level);
+void kvm_tdp_mmu_walk_lockless_begin(void);
+void kvm_tdp_mmu_walk_lockless_end(void);
+bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
+			       struct shadow_page_walk *walk);
 
 #ifdef CONFIG_X86_64
 void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
-- 
2.32.0.272.g935e593368-goog


