kvm.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	"Lai Jiangshan" <laijs@linux.alibaba.com>,
	"Maxim Levitsky" <mlevitsk@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Vitaly Kuznetsov" <vkuznets@redhat.com>,
	"Wanpeng Li" <wanpengli@tencent.com>,
	"Jim Mattson" <jmattson@google.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH V2] KVM: X86: fix tlb_flush_guest()
Date: Wed, 2 Jun 2021 15:39:47 +0000
Message-ID: <YLemQ++Xdvh5TVNe@google.com>
In-Reply-To: <20210531172256.2908-1-jiangshanlai@gmail.com>

On Tue, Jun 01, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> For KVM_VCPU_FLUSH_TLB used in kvm_flush_tlb_multi(), the guest expects
> the hypervisor to perform the equivalent of native_flush_tlb_global()
> or invpcid_flush_all() on the specified guest CPU.

I don't like referencing guest code, here and in the comment.  The paravirt
flush isn't limited to Linux guests, the existing kernel code might change, and
it doesn't directly explain KVM's responsibilities in response to a guest TLB
flush.

Something like:

  When using shadow paging, unload the guest MMU when emulating a guest TLB
  flush to ensure all roots are synchronized.  From the guest's perspective,
  flushing the TLB ensures any and all modifications to its PTEs will be
  recognized by the CPU.

  Note, unloading the MMU is overkill, but is done to mirror KVM's existing
  handling of INVPCID(all) and ensure the bug is squashed.  Future cleanup
  can be done to more precisely synchronize roots when servicing a guest
  TLB flush.
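
A minimal sketch of why the synchronization matters, assuming a toy model
rather than anything in KVM: the guest edits its own PTE, the hypervisor's
shadow copy goes stale, and a flush that only touches the hardware TLB
simply refills from the stale shadow.  All names (toy_vcpu, toy_translate(),
toy_flush_guest_no_sync(), and so on) are hypothetical and exist only for
this illustration.

```c
/*
 * Toy model only: every name below is hypothetical and none of this is
 * KVM code.  One vCPU, a guest page table, the hypervisor's shadow copy
 * that the hardware actually walks, and a hardware TLB.
 */
#include <assert.h>
#include <stdbool.h>

#define TOY_NPAGES 4

struct toy_vcpu {
	int guest_pte[TOY_NPAGES];	/* written by the guest, not trapped */
	int shadow_pte[TOY_NPAGES];	/* hypervisor copy, walked on a TLB miss */
	int tlb[TOY_NPAGES];		/* cached translations */
	bool tlb_valid[TOY_NPAGES];
};

void toy_init(struct toy_vcpu *v)
{
	for (int i = 0; i < TOY_NPAGES; i++) {
		v->guest_pte[i] = i;
		v->shadow_pte[i] = i;
		v->tlb[i] = 0;
		v->tlb_valid[i] = false;
	}
}

/* Hardware access: use the TLB on a hit, otherwise walk the shadow table. */
int toy_translate(struct toy_vcpu *v, int vpn)
{
	assert(vpn >= 0 && vpn < TOY_NPAGES);
	if (!v->tlb_valid[vpn]) {
		v->tlb[vpn] = v->shadow_pte[vpn];
		v->tlb_valid[vpn] = true;
	}
	return v->tlb[vpn];
}

/* The guest edits its own PTE; the shadow copy is now out of date. */
void toy_guest_set_pte(struct toy_vcpu *v, int vpn, int frame)
{
	assert(vpn >= 0 && vpn < TOY_NPAGES);
	v->guest_pte[vpn] = frame;
}

void toy_flush_hw_tlb(struct toy_vcpu *v)
{
	for (int i = 0; i < TOY_NPAGES; i++)
		v->tlb_valid[i] = false;
}

void toy_sync_shadow(struct toy_vcpu *v)
{
	for (int i = 0; i < TOY_NPAGES; i++)
		v->shadow_pte[i] = v->guest_pte[i];
}

/* Emulated guest TLB flush that only flushes hardware (the buggy shape). */
void toy_flush_guest_no_sync(struct toy_vcpu *v)
{
	toy_flush_hw_tlb(v);
}

/* Emulated guest TLB flush that synchronizes first (the intended shape). */
void toy_flush_guest_with_sync(struct toy_vcpu *v)
{
	toy_sync_shadow(v);
	toy_flush_hw_tlb(v);
}
```

In this model, calling toy_guest_set_pte() followed by
toy_flush_guest_no_sync() leaves toy_translate() returning the old frame,
while toy_flush_guest_with_sync() makes the new frame visible.  The sketch
syncs a single table; it says nothing about how KVM tracks multiple roots.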
  
> When TDP is enabled, it is sufficient to just flush the hardware
> TLB of the specified guest CPU.
> 
> But when using shadow paging, the hypervisor has to sync the
> shadow page table first, before flushing the hardware TLB, so that
> it can truly emulate the operation of invpcid_flush_all() in the guest.
> 
> The problem exists since the first implementation of KVM_VCPU_FLUSH_TLB
> in commit f38a7b75267f ("KVM: X86: support paravirtualized help for TLB
> shootdowns").  But I don't think it was a real-world problem at that
> time, since the guest flushed the local CPU's TLB first, before queuing
> KVM_VCPU_FLUSH_TLB to other CPUs.  That means the hypervisor synced the
> shadow page table before seeing the corresponding KVM_VCPU_FLUSH_TLBs.
> 
> After commit 4ce94eabac16 ("x86/mm/tlb: Flush remote and local TLBs
> concurrently"), the guest no longer flushes the local CPU's TLB first, so
> the hypervisor can handle another vCPU's KVM_VCPU_FLUSH_TLB before the
> local vCPU's TLB flush and might flush the hardware TLB without syncing
> the shadow page table beforehand.
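
The ordering described in the quoted paragraphs above can be sketched with a
toy two-vCPU model.  This is an illustration under stated assumptions, not
KVM or Linux code: it assumes one shared shadow copy of a single guest PTE,
assumes the guest's own local flush is intercepted and synchronizes that
copy, and uses hypothetical names throughout (toy_vm, toy_vm_flush(),
toy_run_local_first(), toy_run_remote_first()).

```c
/*
 * Toy model only: hypothetical names, not KVM or Linux code.  Two vCPUs
 * share one guest PTE and one shadow copy of it; each has its own TLB.
 */
#include <assert.h>
#include <stdbool.h>

#define TOY_NVCPUS 2

struct toy_vm {
	int guest_pte;			/* what the guest wrote */
	int shadow_pte;			/* what hardware walks on a miss */
	int tlb[TOY_NVCPUS];
	bool tlb_valid[TOY_NVCPUS];
};

/* Hardware access from one vCPU: TLB hit, or refill from the shadow copy. */
int toy_vm_access(struct toy_vm *vm, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NVCPUS);
	if (!vm->tlb_valid[cpu]) {
		vm->tlb[cpu] = vm->shadow_pte;
		vm->tlb_valid[cpu] = true;
	}
	return vm->tlb[cpu];
}

/* Start with frame 1 cached everywhere, then let the guest remap to frame 2. */
void toy_vm_setup(struct toy_vm *vm)
{
	vm->guest_pte = 1;
	vm->shadow_pte = 1;
	for (int cpu = 0; cpu < TOY_NVCPUS; cpu++) {
		vm->tlb_valid[cpu] = false;
		toy_vm_access(vm, cpu);
	}
	vm->guest_pte = 2;
}

/* Flush emulation on one vCPU, with or without a shadow sync first. */
void toy_vm_flush(struct toy_vm *vm, int cpu, bool sync)
{
	assert(cpu >= 0 && cpu < TOY_NVCPUS);
	if (sync)
		vm->shadow_pte = vm->guest_pte;
	vm->tlb_valid[cpu] = false;
}

/* Older guest ordering: local flush (assumed to sync) before the remote one. */
int toy_run_local_first(void)
{
	struct toy_vm vm;

	toy_vm_setup(&vm);
	toy_vm_flush(&vm, 0, true);	/* local flush on vCPU 0 */
	toy_vm_flush(&vm, 1, false);	/* paravirt request serviced on vCPU 1 */
	return toy_vm_access(&vm, 1);
}

/* Concurrent ordering: the remote request may be serviced before the local flush. */
int toy_run_remote_first(bool sync_on_remote)
{
	struct toy_vm vm;

	toy_vm_setup(&vm);
	toy_vm_flush(&vm, 1, sync_on_remote);	/* remote request first */
	toy_vm_access(&vm, 1);			/* vCPU 1 runs and refills its TLB */
	toy_vm_flush(&vm, 0, true);		/* local flush arrives later */
	return toy_vm_access(&vm, 1);
}
```

With the local flush first, vCPU 1 sees the new frame even though its own
flush never synchronized anything.  With the remote request serviced first
and no sync, vCPU 1 refills from the stale shadow copy and keeps the old
frame; passing sync_on_remote = true in the model removes the stale result.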

Maybe reword the last two paragraphs to make it clear that a change in the Linux
kernel exposed the KVM bug?

  This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB, but
  was only recently exposed after Linux guests stopped flushing the local
  CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16,
  "x86/mm/tlb: Flush remote and local TLBs concurrently").

> Cc: Maxim Levitsky <mlevitsk@redhat.com>
> Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns")
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
> Changed from V1
> 	Use kvm_mmu_unload() instead of KVM_REQ_MMU_RELOAD to avoid
> 	causing unneeded iteration of vcpu_enter_guest().
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index bbc4e04e67ad..27248e330767 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3072,6 +3072,22 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
>  static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>  {
>  	++vcpu->stat.tlb_flush;
> +
> +	if (!tdp_enabled) {
> +		/*
> +		 * When two dimensional paging is not enabled, the
> +		 * operation should be equivalent to native_flush_tlb_global()
> +		 * or invpcid_flush_all() on the guest's behalf, i.e.
> +		 * synchronizing the shadow page table and flushing.
> +		 *
> +		 * kvm_mmu_unload() results in a subsequent kvm_mmu_load()
> +		 * before entering the guest, which will do the required
> +		 * page table synchronizing and TLB flushing.
> +		 */
> +		kvm_mmu_unload(vcpu);

I don't like the full unload, but I suppose the big hammer does make sense for a
backport since handle_invpcid() and toggling CR4.PGE already nuke everything.  :-/

> +		return;
> +	}
> +
>  	static_call(kvm_x86_tlb_flush_guest)(vcpu);
>  }
>  
> -- 
> 2.19.1.6.gb485710b
> 
