kvm.vger.kernel.org archive mirror
* [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
@ 2021-11-15 21:17 Ben Gardon
  2021-11-16  0:03 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Gardon @ 2021-11-15 21:17 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
	David Matlack, Mingwei Zhang, Yulei Zhang, Wanpeng Li,
	Xiao Guangrong, Kai Huang, Keqian Zhu, David Hildenbrand,
	Ben Gardon, stable

When recursively clearing out disconnected pts, the range based TLB
flush in handle_removed_tdp_mmu_page uses the wrong starting GFN,
resulting in the flush mostly missing the affected range. Fix this by
using base_gfn for the flush.
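
As a concrete illustration, here is a stand-alone sketch of the flush-range
arithmetic (not kernel code: the constants mirror PT64_ENT_PER_PAGE and
KVM_PAGES_PER_HPAGE() for a level-1 page table, and the base GFN is an
arbitrary example value):

#include <stdio.h>

#define ENTRIES			512UL	/* PT64_ENT_PER_PAGE              */
#define PAGES_PER_CHILD		1UL	/* KVM_PAGES_PER_HPAGE(level)     */
#define PAGES_PER_TABLE		512UL	/* KVM_PAGES_PER_HPAGE(level + 1) */

int main(void)
{
	unsigned long base_gfn = 0x10000, gfn = base_gfn, i;

	for (i = 0; i < ENTRIES; i++)
		gfn = base_gfn + i * PAGES_PER_CHILD;	/* ends at base_gfn + 511 */

	/* Old flush: starts at the last child's GFN, so it covers only the
	 * final page of the removed range plus unrelated GFNs above it. */
	printf("old flush: [%#lx, %#lx)\n", gfn, gfn + PAGES_PER_TABLE);

	/* Fixed flush: covers exactly the GFNs the removed table mapped. */
	printf("new flush: [%#lx, %#lx)\n", base_gfn, base_gfn + PAGES_PER_TABLE);
	return 0;
}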

In response to feedback from David Matlack on the RFC version of this
patch, also move a few definitions into the for loop in the function to
prevent unintended references to them in the future.

Fixes: a066e61f13cf ("KVM: x86/mmu: Factor out handling of removed page tables")
CC: stable@vger.kernel.org

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 7c5dd83e52de..4bd541050d21 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -317,9 +317,6 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
 	struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
 	int level = sp->role.level;
 	gfn_t base_gfn = sp->gfn;
-	u64 old_child_spte;
-	u64 *sptep;
-	gfn_t gfn;
 	int i;
 
 	trace_kvm_mmu_prepare_zap_page(sp);
@@ -327,8 +324,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
 	tdp_mmu_unlink_page(kvm, sp, shared);
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
-		sptep = rcu_dereference(pt) + i;
-		gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
+		u64 *sptep = rcu_dereference(pt) + i;
+		gfn_t gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
+		u64 old_child_spte;
 
 		if (shared) {
 			/*
@@ -374,7 +372,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
 				    shared);
 	}
 
-	kvm_flush_remote_tlbs_with_address(kvm, gfn,
+	kvm_flush_remote_tlbs_with_address(kvm, base_gfn,
 					   KVM_PAGES_PER_HPAGE(level + 1));
 
 	call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
-- 
2.34.0.rc1.387.gb447b232ab-goog



* Re: [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
  2021-11-15 21:17 [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt Ben Gardon
@ 2021-11-16  0:03 ` Sean Christopherson
  2021-11-16 17:29   ` Ben Gardon
  2021-11-30  1:24   ` David Matlack
  0 siblings, 2 replies; 5+ messages in thread
From: Sean Christopherson @ 2021-11-16  0:03 UTC (permalink / raw)
  To: Ben Gardon
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Matlack, Mingwei Zhang, Yulei Zhang, Wanpeng Li,
	Xiao Guangrong, Kai Huang, Keqian Zhu, David Hildenbrand, stable

On Mon, Nov 15, 2021, Ben Gardon wrote:
> When recursively clearing out disconnected pts, the range based TLB
> flush in handle_removed_tdp_mmu_page uses the wrong starting GFN,
> resulting in the flush mostly missing the affected range. Fix this by
> using base_gfn for the flush.
> 
> In response to feedback from David Matlack on the RFC version of this
> patch, also move a few definitions into the for loop in the function to
> prevent unintended references to them in the future.

Rats, I didn't read David's feedback or I would've responded there.

> Fixes: a066e61f13cf ("KVM: x86/mmu: Factor out handling of removed page tables")
> CC: stable@vger.kernel.org
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 7c5dd83e52de..4bd541050d21 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -317,9 +317,6 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>  	struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
>  	int level = sp->role.level;
>  	gfn_t base_gfn = sp->gfn;
> -	u64 old_child_spte;
> -	u64 *sptep;
> -	gfn_t gfn;
>  	int i;
>  
>  	trace_kvm_mmu_prepare_zap_page(sp);
> @@ -327,8 +324,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>  	tdp_mmu_unlink_page(kvm, sp, shared);
>  
>  	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> -		sptep = rcu_dereference(pt) + i;
> -		gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> +		u64 *sptep = rcu_dereference(pt) + i;
> +		gfn_t gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> +		u64 old_child_spte;

TL;DR: this type of optional refactoring doesn't belong in a patch Cc'd for stable,
and my personal preference is to always declare variables at function scope (it's
not a hard rule though, Paolo has overruled me at least once :-) ).

Declaring variables in an inner scope is not always "better".  In particular, it
can lead to variable shadowing, which can lead to functional issues of a different
sort.  Most shadowing is fairly obvious, and truly egregious bugs will often result
in the compiler complaining about consuming an uninitialized variable.

But the worst-case scenario is if the inner scope shadows a function parameter, in
which case the compiler will not complain and will even consume an uninitialized
variable without warning.  IIRC, we actually had a Hyper-V bug of that nature
where an incoming @vcpu was shadowed.  Examples below.
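
As a contrived stand-alone illustration of that silent-shadowing hazard
(hypothetical names and logic, not the Hyper-V code referenced above): the
loop-scope declaration shadows the parameter, a plain -Wall build says
nothing, and only -Wshadow/-Wshadow=local point at the inner declaration.

#include <stdio.h>

/* Intended to return how many entries are positive, but the loop-scope
 * "count" shadows the parameter, so every increment lands on the shadow
 * and is thrown away at the end of each iteration. */
static int count_positive(const int *vals, int n, int count)
{
	for (int i = 0; i < n; i++) {
		int count = 0;		/* oops: shadows the parameter */

		if (vals[i] > 0)
			count++;
		printf("entry %d: running total %d\n", i, count);	/* 0 or 1, never a total */
	}

	return count;			/* the untouched parameter */
}

int main(void)
{
	int vals[] = { 1, 2, 3 };

	printf("total: %d\n", count_positive(vals, 3, 0));	/* prints 0, not 3 */
	return 0;
}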

So yes, on one hand moving the declarations inside the loop avoids one potential flavor
of bug, but on the other it creates the possibility of an entirely different class of bugs.
The main reason I prefer declaring at function scope is that I find it easier to
visually detect a variable being used after a for loop, versus detecting that a variable
is being shadowed, especially if the function is largish and the two declarations
don't fit on the screen.

There are of course counter-examples, e.g. commit 5c49d1850ddd ("KVM: VMX: Fix a
TSX_CTRL_CPUID_CLEAR field mask issue") immediately jumps to mind, so there's
certainly an element of personal preference.

E.g. this will fail with "error: ‘sptep’ redeclared as different kind of symbol":

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 4e226cdb40d9..011639bf633c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -369,7 +369,7 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
  * early rcu_dereferences in the function.
  */
 static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
-                                       bool shared)
+                                       bool shared, u64 *sptep)
 {
        struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
        int level = sp->role.level;
@@ -431,8 +431,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
                                    shared);
        }

-       kvm_flush_remote_tlbs_with_address(kvm, gfn,
-                                          KVM_PAGES_PER_HPAGE(level + 1));
+       if (sptep)
+               kvm_flush_remote_tlbs_with_address(kvm, gfn,
+                                                  KVM_PAGES_PER_HPAGE(level + 1));

        call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
 }
@@ -532,7 +533,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
         */
        if (was_present && !was_leaf && (is_leaf || !is_present))
                handle_removed_tdp_mmu_page(kvm,
-                               spte_to_child_pt(old_spte, level), shared);
+                               spte_to_child_pt(old_spte, level), shared, NULL);
 }

 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,


whereas moving the second declaration into the loop will compile happily.

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 4e226cdb40d9..3e83fd66c0dc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -369,13 +369,12 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
  * early rcu_dereferences in the function.
  */
 static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
-                                       bool shared)
+                                       bool shared, u64 *sptep)
 {
        struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
        int level = sp->role.level;
        gfn_t base_gfn = sp->gfn;
        u64 old_child_spte;
-       u64 *sptep;
        gfn_t gfn;
        int i;

@@ -384,7 +383,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
        tdp_mmu_unlink_page(kvm, sp, shared);

        for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
-               sptep = rcu_dereference(pt) + i;
+               u64 *sptep = rcu_dereference(pt) + i;
                gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);

                if (shared) {
@@ -431,8 +430,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
                                    shared);
        }

-       kvm_flush_remote_tlbs_with_address(kvm, gfn,
-                                          KVM_PAGES_PER_HPAGE(level + 1));
+       if (sptep)
+               kvm_flush_remote_tlbs_with_address(kvm, gfn,
+                                                  KVM_PAGES_PER_HPAGE(level + 1));

        call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
 }
@@ -532,7 +532,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
         */
        if (was_present && !was_leaf && (is_leaf || !is_present))
                handle_removed_tdp_mmu_page(kvm,
-                               spte_to_child_pt(old_spte, level), shared);
+                               spte_to_child_pt(old_spte, level), shared, NULL);
 }

 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,


* Re: [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
  2021-11-16  0:03 ` Sean Christopherson
@ 2021-11-16 17:29   ` Ben Gardon
  2021-11-16 17:55     ` Paolo Bonzini
  2021-11-30  1:24   ` David Matlack
  1 sibling, 1 reply; 5+ messages in thread
From: Ben Gardon @ 2021-11-16 17:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Peter Shier,
	David Matlack, Mingwei Zhang, Yulei Zhang, Wanpeng Li,
	Xiao Guangrong, Kai Huang, Keqian Zhu, David Hildenbrand, stable

On Mon, Nov 15, 2021 at 4:03 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Nov 15, 2021, Ben Gardon wrote:
> > When recursively clearing out disconnected pts, the range based TLB
> > flush in handle_removed_tdp_mmu_page uses the wrong starting GFN,
> > resulting in the flush mostly missing the affected range. Fix this by
> > using base_gfn for the flush.
> >
> > In response to feedback from David Matlack on the RFC version of this
> > patch, also move a few definitions into the for loop in the function to
> > prevent unintended references to them in the future.
>
> Rats, I didn't read David's feedback or I would've responded there.
>
> > Fixes: a066e61f13cf ("KVM: x86/mmu: Factor out handling of removed page tables")
> > CC: stable@vger.kernel.org
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  arch/x86/kvm/mmu/tdp_mmu.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 7c5dd83e52de..4bd541050d21 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -317,9 +317,6 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> >       struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
> >       int level = sp->role.level;
> >       gfn_t base_gfn = sp->gfn;
> > -     u64 old_child_spte;
> > -     u64 *sptep;
> > -     gfn_t gfn;
> >       int i;
> >
> >       trace_kvm_mmu_prepare_zap_page(sp);
> > @@ -327,8 +324,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> >       tdp_mmu_unlink_page(kvm, sp, shared);
> >
> >       for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> > -             sptep = rcu_dereference(pt) + i;
> > -             gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> > +             u64 *sptep = rcu_dereference(pt) + i;
> > +             gfn_t gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> > +             u64 old_child_spte;
>
> TL;DR: this type of optional refactoring doesn't belong in a patch Cc'd for stable,
> and my personal preference is to always declare variables at function scope (it's
> not a hard rule though, Paolo has overruled me at least once :-) ).

That makes sense. I don't have a preference either way. Paolo, if you
want the version without the refactor, the version I sent in the RFC
should be good. If the refactor is desired, I can separate it out into
another patch and send a v2 of this patch as a mini series, tagging
only the fix for stable.

I've generally preferred declaring variables at function scope too
since that seems like the overwhelming convention, but it's always
struck me as a bit of a waste to not make use of scoping rules more.
It does make it nice and clear how things should be laid out when
debugging the kernel with GDB or something though.

In any case, please let me know how you'd like the changes organized
and I can send follow-ups as needed, or we can just move forward
with the RFC version.

>
> Declaring variables in an inner scope is not always "better".  In particular, it
> can lead to variable shadowing, which can lead to functional issues of a different
> sort.  Most shadowing is fairly obvious, and truly egregious bugs will often result
> in the compiler complaining about consuming an uninitialized variable.
>
> But the worst-case scenario is if the inner scope shadows a function parameter, in
> which case the compiler will not complain and will even consume an uninitialized
> variable without warning.  IIRC, we actually had a Hyper-V bug of that nature
> where an incoming @vcpu was shadowed.  Examples below.
>
> So yes, on one hand moving the declarations inside the loop avoids one potential flavor
> of bug, but on the other it creates the possibility of an entirely different class of bugs.
> The main reason I prefer declaring at function scope is that I find it easier to
> visually detect a variable being used after a for loop, versus detecting that a variable
> is being shadowed, especially if the function is largish and the two declarations
> don't fit on the screen.
>
> There are of course counter-examples, e.g. commit 5c49d1850ddd ("KVM: VMX: Fix a
> TSX_CTRL_CPUID_CLEAR field mask issue") immediately jumps to mind, so there's
> certainly an element of personal preference.
>
> E.g. this will fail with "error: ‘sptep’ redeclared as different kind of symbol":
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 4e226cdb40d9..011639bf633c 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -369,7 +369,7 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
>   * early rcu_dereferences in the function.
>   */
>  static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> -                                       bool shared)
> +                                       bool shared, u64 *sptep)
>  {
>         struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
>         int level = sp->role.level;
> @@ -431,8 +431,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>                                     shared);
>         }
>
> -       kvm_flush_remote_tlbs_with_address(kvm, gfn,
> -                                          KVM_PAGES_PER_HPAGE(level + 1));
> +       if (sptep)
> +               kvm_flush_remote_tlbs_with_address(kvm, gfn,
> +                                                  KVM_PAGES_PER_HPAGE(level + 1));
>
>         call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -532,7 +533,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>          */
>         if (was_present && !was_leaf && (is_leaf || !is_present))
>                 handle_removed_tdp_mmu_page(kvm,
> -                               spte_to_child_pt(old_spte, level), shared);
> +                               spte_to_child_pt(old_spte, level), shared, NULL);
>  }
>
>  static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>
>
> whereas moving the second declaration into the loop will compile happily.
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 4e226cdb40d9..3e83fd66c0dc 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -369,13 +369,12 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
>   * early rcu_dereferences in the function.
>   */
>  static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> -                                       bool shared)
> +                                       bool shared, u64 *sptep)
>  {
>         struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
>         int level = sp->role.level;
>         gfn_t base_gfn = sp->gfn;
>         u64 old_child_spte;
> -       u64 *sptep;
>         gfn_t gfn;
>         int i;
>
> @@ -384,7 +383,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>         tdp_mmu_unlink_page(kvm, sp, shared);
>
>         for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> -               sptep = rcu_dereference(pt) + i;
> +               u64 *sptep = rcu_dereference(pt) + i;
>                 gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
>
>                 if (shared) {
> @@ -431,8 +430,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>                                     shared);
>         }
>
> -       kvm_flush_remote_tlbs_with_address(kvm, gfn,
> -                                          KVM_PAGES_PER_HPAGE(level + 1));
> +       if (sptep)
> +               kvm_flush_remote_tlbs_with_address(kvm, gfn,
> +                                                  KVM_PAGES_PER_HPAGE(level + 1));
>
>         call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -532,7 +532,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>          */
>         if (was_present && !was_leaf && (is_leaf || !is_present))
>                 handle_removed_tdp_mmu_page(kvm,
> -                               spte_to_child_pt(old_spte, level), shared);
> +                               spte_to_child_pt(old_spte, level), shared, NULL);
>  }
>
>  static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,


* Re: [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
  2021-11-16 17:29   ` Ben Gardon
@ 2021-11-16 17:55     ` Paolo Bonzini
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2021-11-16 17:55 UTC (permalink / raw)
  To: Ben Gardon, Sean Christopherson
  Cc: linux-kernel, kvm, Peter Xu, Peter Shier, David Matlack,
	Mingwei Zhang, Yulei Zhang, Wanpeng Li, Xiao Guangrong,
	Kai Huang, Keqian Zhu, David Hildenbrand, stable

On 11/16/21 18:29, Ben Gardon wrote:
>> TL;DR: this type of optional refactoring doesn't belong in a patch Cc'd for stable,
>> and my personal preference is to always declare variables at function scope (it's
>> not a hard rule though, Paolo has overruled me at least once :-) ).
>
> That makes sense. I don't have a preference either way. Paolo, if you
> want the version without the refactor, the version I sent in the RFC
> should be good. If the refactor is desired, I can separate it out into
> another patch and send a v2 of this patch as a mini series, tagging
> only the fix for stable.

It's really a damned-if-you-do/damned-if-you-don't situation.  And also,
keeping the patch in stable as similar as possible to the mainline version
has the advantage that future backports have a slightly lower chance of
breaking due to shadowed variables.

In the end I agree with both of you :) and in this case I tend to accept 
the patch as written.  So I queued it, though it probably will not be in 
the very next pull request.

My plan for the next couple days is to send a pull request and finally 
move the development tree to 5.16-rc1, so that I can push to kvm/next 
all the SVM, memslot and xarray stuff that's pending.  Then I'll go back 
to this one.

Paolo

> I've generally preferred declaring variables at function scope too
> since that seems like the overwhelming convention, but it's always
> struck me as a bit of a waste to not make use of scoping rules more.
> It does make it nice and clear how things should be laid out when
> debugging the kernel with GDB or something though.
> 
> In any case, please let me know how you'd like the changes organized
> and I can send follow-ups as needed, or we can just move forward
> with the RFC version.
> 



* Re: [PATCH 1/1] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
  2021-11-16  0:03 ` Sean Christopherson
  2021-11-16 17:29   ` Ben Gardon
@ 2021-11-30  1:24   ` David Matlack
  1 sibling, 0 replies; 5+ messages in thread
From: David Matlack @ 2021-11-30  1:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Ben Gardon, linux-kernel, kvm, Paolo Bonzini, Peter Xu,
	Peter Shier, Mingwei Zhang, Yulei Zhang, Wanpeng Li,
	Xiao Guangrong, Kai Huang, Keqian Zhu, David Hildenbrand, stable

On Mon, Nov 15, 2021 at 4:03 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Nov 15, 2021, Ben Gardon wrote:
> > When recursively clearing out disconnected pts, the range based TLB
> > flush in handle_removed_tdp_mmu_page uses the wrong starting GFN,
> > resulting in the flush mostly missing the affected range. Fix this by
> > using base_gfn for the flush.
> >
> > In response to feedback from David Matlack on the RFC version of this
> > patch, also move a few definitions into the for loop in the function to
> > prevent unintended references to them in the future.
>
> Rats, I didn't read David's feedback or I would've responded there.
>
> > Fixes: a066e61f13cf ("KVM: x86/mmu: Factor out handling of removed page tables")
> > CC: stable@vger.kernel.org
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  arch/x86/kvm/mmu/tdp_mmu.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 7c5dd83e52de..4bd541050d21 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -317,9 +317,6 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> >       struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
> >       int level = sp->role.level;
> >       gfn_t base_gfn = sp->gfn;
> > -     u64 old_child_spte;
> > -     u64 *sptep;
> > -     gfn_t gfn;
> >       int i;
> >
> >       trace_kvm_mmu_prepare_zap_page(sp);
> > @@ -327,8 +324,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> >       tdp_mmu_unlink_page(kvm, sp, shared);
> >
> >       for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> > -             sptep = rcu_dereference(pt) + i;
> > -             gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> > +             u64 *sptep = rcu_dereference(pt) + i;
> > +             gfn_t gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
> > +             u64 old_child_spte;
>
> TL;DR: this type of optional refactoring doesn't belong in a patch Cc'd for stable,

This is a great point. Regardless of our stance on where variables are
declared, the refactor wouldn't make sense to send to stable (or at a
minimum it should be considered separately in its own patch).

> and my personal preference is to always declare variables at function scope (it's
> not a hard rule though, Paolo has overruled me at least once :-) ).
>
> Declaring variables in an inner scope is not always "better".  In particular, it
> can lead to variable shadowing, which can lead to functional issues of a different
> sort.  Most shadowing is fairly obvious, and truly egregious bugs will often result
> in the compiler complaining about consuming an uninitialized variable.
>
> But the worst-case scenario is if the inner scope shadows a function parameter, in
> which case the compiler will not complain and will even consume an uninitialized
> variable without warning.  IIRC, we actually had a Hyper-V bug of that nature
> where an incoming @vcpu was shadowed.  Examples below.
>
> So yes, on one hand moving the declarations inside the loop avoids one potential flavor
> of bug, but on the other it creates the possibility of an entirely different class of bugs.
> The main reason I prefer declaring at function scope is that I find it easier to
> visually detect a variable being used after a for loop, versus detecting that a variable
> is being shadowed, especially if the function is largish and the two declarations
> don't fit on the screen.

Ah I have not had the pleasure of debugging a variable shadowing bug
before :), so I never even considered this.

This is something that a compiler could easily detect and disallow for
us. In fact GCC has the -Wshadow=local option which disallows local
variables shadowing other local variables and function parameters.
Unfortunately, Clang either doesn't support this option or spells it
differently (but I haven't been able to find it). I tried to build KVM
with -Wshadow=local today but it looks like I need to get a newer
version of GCC and that's as far as I got.

Both Clang and GCC support -Wshadow but that is too broad. It prevents
local variables from shadowing global variables, which would prevent
us from having local variables named "apic" among other common names.
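
A tiny sketch of the difference (hypothetical code; -Wshadow=local needs a
newer GCC, per the version issue above):

/* gcc -Wshadow       -c shadow.c   -> warns on both shadowing declarations
 * gcc -Wshadow=local -c shadow.c   -> warns only on the loop-scope "i"
 */
int apic;				/* stand-in for a global like "apic" */

int f(int i)
{
	int apic = 42;			/* shadows the global: -Wshadow only */

	for (int i = 0; i < apic; i++)	/* shadows the parameter: both warn */
		;

	return i;
}

To try it tree-wide, something like "make KCFLAGS=-Wshadow=local" is the
usual way to pass extra compiler flags through kbuild (untested here).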

>
> There are of course counter-examples, e.g. commit 5c49d1850ddd ("KVM: VMX: Fix a
> TSX_CTRL_CPUID_CLEAR field mask issue") immediately jumps to mind, so there's
> certainly an element of personal preference.
>
> E.g. this will fail with "error: ‘sptep’ redeclared as different kind of symbol":
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 4e226cdb40d9..011639bf633c 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -369,7 +369,7 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
>   * early rcu_dereferences in the function.
>   */
>  static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> -                                       bool shared)
> +                                       bool shared, u64 *sptep)
>  {
>         struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
>         int level = sp->role.level;
> @@ -431,8 +431,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>                                     shared);
>         }
>
> -       kvm_flush_remote_tlbs_with_address(kvm, gfn,
> -                                          KVM_PAGES_PER_HPAGE(level + 1));
> +       if (sptep)
> +               kvm_flush_remote_tlbs_with_address(kvm, gfn,
> +                                                  KVM_PAGES_PER_HPAGE(level + 1));
>
>         call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -532,7 +533,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>          */
>         if (was_present && !was_leaf && (is_leaf || !is_present))
>                 handle_removed_tdp_mmu_page(kvm,
> -                               spte_to_child_pt(old_spte, level), shared);
> +                               spte_to_child_pt(old_spte, level), shared, NULL);
>  }
>
>  static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>
>
> whereas moving the second declaration into the loop will compile happily.
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 4e226cdb40d9..3e83fd66c0dc 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -369,13 +369,12 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
>   * early rcu_dereferences in the function.
>   */
>  static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
> -                                       bool shared)
> +                                       bool shared, u64 *sptep)
>  {
>         struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt));
>         int level = sp->role.level;
>         gfn_t base_gfn = sp->gfn;
>         u64 old_child_spte;
> -       u64 *sptep;
>         gfn_t gfn;
>         int i;
>
> @@ -384,7 +383,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>         tdp_mmu_unlink_page(kvm, sp, shared);
>
>         for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> -               sptep = rcu_dereference(pt) + i;
> +               u64 *sptep = rcu_dereference(pt) + i;
>                 gfn = base_gfn + i * KVM_PAGES_PER_HPAGE(level);
>
>                 if (shared) {
> @@ -431,8 +430,9 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
>                                     shared);
>         }
>
> -       kvm_flush_remote_tlbs_with_address(kvm, gfn,
> -                                          KVM_PAGES_PER_HPAGE(level + 1));
> +       if (sptep)
> +               kvm_flush_remote_tlbs_with_address(kvm, gfn,
> +                                                  KVM_PAGES_PER_HPAGE(level + 1));
>
>         call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -532,7 +532,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>          */
>         if (was_present && !was_leaf && (is_leaf || !is_present))
>                 handle_removed_tdp_mmu_page(kvm,
> -                               spte_to_child_pt(old_spte, level), shared);
> +                               spte_to_child_pt(old_spte, level), shared, NULL);
>  }
>
>  static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,


