linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Ben Gardon <bgardon@google.com>,
	Junaid Shahid <junaids@google.com>,
	Liran Alon <liran.alon@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	John Haxby <john.haxby@oracle.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [PATCH v3 23/37] KVM: nVMX: Add helper to handle TLB flushes on nested VM-Enter/VM-Exit
Date: Sat, 30 Oct 2021 09:34:31 +0800	[thread overview]
Message-ID: <CAJhGHyBNazRUiwPT5nGC=JNYX96J1dP9Y+KWFz7TeYuT3teYZg@mail.gmail.com> (raw)
In-Reply-To: <YXwq+Q3+I81jwv7G@google.com>

/

On Sat, Oct 30, 2021 at 1:10 AM Sean Christopherson <seanjc@google.com> wrote:
>
> TL;DR: I'll work on a proper series next week, there are multiple things that need
> to be fixed.
>
> On Fri, Oct 29, 2021, Lai Jiangshan wrote:
> > On Thu, Oct 28, 2021 at 11:22 PM Sean Christopherson <seanjc@google.com> wrote:
> > > The fix should simply be:
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index eedcebf58004..574823370e7a 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -1202,17 +1202,15 @@ static void nested_vmx_transition_tlb_flush(struct kvm_vcpu *vcpu,
> > >          *
> > >          * If a TLB flush isn't required due to any of the above, and vpid12 is
> > >          * changing then the new "virtual" VPID (vpid12) will reuse the same
> > > -        * "real" VPID (vpid02), and so needs to be flushed.  There's no direct
> > > -        * mapping between vpid02 and vpid12, vpid02 is per-vCPU and reused for
> > > -        * all nested vCPUs.  Remember, a flush on VM-Enter does not invalidate
> > > -        * guest-physical mappings, so there is no need to sync the nEPT MMU.
> > > +        * "real" VPID (vpid02), and so needs to be flushed.  Like the !vpid02
> > > +        * case above, this is a full TLB flush from the guest's perspective.
> > >          */
> > >         if (!nested_has_guest_tlb_tag(vcpu)) {
> > >                 kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > >         } else if (is_vmenter &&
> > >                    vmcs12->virtual_processor_id != vmx->nested.last_vpid) {
> > >                 vmx->nested.last_vpid = vmcs12->virtual_processor_id;
> > > -               vpid_sync_context(nested_get_vpid02(vcpu));
> > > +               kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> >
> > This change is neat.
>
> Heh, yeah, but too neat to be right :-)
>
> > But current KVM_REQ_TLB_FLUSH_GUEST flushes vpid01 only, and it doesn't flush
> > vpid02.  vmx_flush_tlb_guest() might need to be changed to flush vpid02 too.
>
> Hmm.  I think vmx_flush_tlb_guest() is straight up broken.  E.g. if EPT is enabled
> but L1 doesn't use EPT for L2 and doesn't intercept INVPCID, then KVM will handle
> INVPCID from L2.  That means the recent addition to kvm_invalidate_pcid() (see
> below) will flush the wrong VPID.  And it's incorrect (well, more than is required
> by the SDM) to flush both VPIDs because flushes from INVPCID (and flushes from the
> guest's perspective in general) are scoped to the current VPID, e.g. a "full" TLB
> flush in the "host" by toggling CR4.PGE flushes only the current VPID:

I think KVM_REQ_TLB_FLUSH_GUEST/kvm_vcpu_flush_tlb_guest/vmx_flush_tlb_guest
was deliberately designed for the L1 guest only.  It can be seen from the code,
from the history, and from the caller's side.  For example,
nested_vmx_transition_tlb_flush() knows KVM_REQ_TLB_FLUSH_GUEST flushes
L1 guest:

        /*
         * If vmcs12 doesn't use VPID, L1 expects linear and combined mappings
         * for *all* contexts to be flushed on VM-Enter/VM-Exit, i.e. it's a
         * full TLB flush from the guest's perspective.  This is required even
         * if VPID is disabled in the host as KVM may need to synchronize the
         * MMU in response to the guest TLB flush.
         *
         * Note, using TLB_FLUSH_GUEST is correct even if nested EPT is in use.
         * EPT is a special snowflake, as guest-physical mappings aren't
         * flushed on VPID invalidations, including VM-Enter or VM-Exit with
         * VPID disabled.  As a result, KVM _never_ needs to sync nEPT
         * entries on VM-Enter because L1 can't rely on VM-Enter to flush
         * those mappings.
         */
        if (!nested_cpu_has_vpid(vmcs12)) {
                kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
                return;
        }

While handle_invvpid() doesn't use KVM_REQ_TLB_FLUSH_GUEST.

So, I don't think KVM_REQ_TLB_FLUSH_GUEST, kvm_vcpu_flush_tlb_guest
or vmx_flush_tlb_guest is broken since they are for L1 guests.
What we have to do is to consider is it worth extending them for
nested guests for the convenience of nested code.

I second that they are extended.

A small comment in your proposal: I found that KVM_REQ_TLB_FLUSH_CURRENT
and KVM_REQ_TLB_FLUSH_GUEST is to flush "current" vpid only, some special
work needs to be added when switching mmu from L1 to L2 and vice versa:
handle the requests before switching.

  reply	other threads:[~2021-10-30  1:34 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-20 21:27 [PATCH v3 00/37] KVM: x86: TLB flushing fixes and enhancements Sean Christopherson
2020-03-20 21:27 ` [PATCH v3 01/37] KVM: VMX: Flush all EPTP/VPID contexts on remote TLB flush Sean Christopherson
2021-08-03  1:45   ` Lai Jiangshan
2021-08-03 15:39     ` Sean Christopherson
2021-08-04  3:11       ` Lai Jiangshan
2021-08-04 15:33         ` Sean Christopherson
2020-03-20 21:27 ` [PATCH v3 02/37] KVM: nVMX: Validate the EPTP when emulating INVEPT(EXTENT_CONTEXT) Sean Christopherson
2020-03-23 14:51   ` Vitaly Kuznetsov
2020-03-23 15:45     ` Sean Christopherson
2020-03-23 23:46       ` Paolo Bonzini
2020-03-20 21:27 ` [PATCH v3 03/37] KVM: nVMX: Invalidate all EPTP contexts when emulating INVEPT for L1 Sean Christopherson
2020-03-23 15:24   ` Vitaly Kuznetsov
2020-03-23 15:53     ` Sean Christopherson
2020-03-23 16:24   ` Jim Mattson
2020-03-23 16:28     ` Sean Christopherson
2020-03-23 16:36       ` Jim Mattson
2020-03-23 16:44         ` Sean Christopherson
2020-03-23 23:50           ` Paolo Bonzini
2020-03-24  0:12             ` Jim Mattson
2020-03-30 18:38               ` Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 04/37] KVM: nVMX: Invalidate all roots when emulating INVVPID without EPT Sean Christopherson
2020-03-23 15:34   ` Vitaly Kuznetsov
2020-03-23 16:04     ` Sean Christopherson
2020-03-23 16:33       ` Vitaly Kuznetsov
2020-03-23 16:50         ` Sean Christopherson
2020-03-23 16:57           ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 05/37] KVM: x86: Export kvm_propagate_fault() (as kvm_inject_emulated_page_fault) Sean Christopherson
2020-03-23 15:47   ` Vitaly Kuznetsov
2020-03-23 16:24     ` Sean Christopherson
2020-03-23 23:56       ` Paolo Bonzini
2020-03-20 21:28 ` [PATCH v3 06/37] KVM: x86: Consolidate logic for injecting page faults to L1 Sean Christopherson
2020-03-24  0:47   ` Paolo Bonzini
2020-03-20 21:28 ` [PATCH v3 07/37] KVM: x86: Sync SPTEs when injecting page/EPT fault into L1 Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 08/37] KVM: VMX: Skip global INVVPID fallback if vpid==0 in vpid_sync_context() Sean Christopherson
2020-03-25  9:33   ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 09/37] KVM: VMX: Use vpid_sync_context() directly when possible Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 10/37] KVM: VMX: Move vpid_sync_vcpu_addr() down a few lines Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 11/37] KVM: VMX: Handle INVVPID fallback logic in vpid_sync_vcpu_addr() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 12/37] KVM: VMX: Drop redundant capability checks in low level INVVPID helpers Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 13/37] KVM: nVMX: Use vpid_sync_vcpu_addr() to emulate INVVPID with address Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 14/37] KVM: x86: Move "flush guest's TLB" logic to separate kvm_x86_ops hook Sean Christopherson
2020-03-25 10:23   ` Vitaly Kuznetsov
2020-03-25 15:41     ` Paolo Bonzini
2020-03-25 16:08       ` Vitaly Kuznetsov
2020-03-25 15:48     ` Sean Christopherson
2020-03-25 16:11       ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 15/37] KVM: VMX: Clean up vmx_flush_tlb_gva() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 16/37] KVM: x86: Drop @invalidate_gpa param from kvm_x86_ops' tlb_flush() Sean Christopherson
2020-03-25 11:23   ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 17/37] KVM: SVM: Wire up ->tlb_flush_guest() directly to svm_flush_tlb() Sean Christopherson
2020-03-25 11:23   ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 18/37] KVM: VMX: Move vmx_flush_tlb() to vmx.c Sean Christopherson
2020-03-25 11:25   ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 19/37] KVM: nVMX: Move nested_get_vpid02() to vmx/nested.h Sean Christopherson
2020-03-25 11:25   ` Vitaly Kuznetsov
2020-03-20 21:28 ` [PATCH v3 20/37] KVM: VMX: Introduce vmx_flush_tlb_current() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 21/37] KVM: SVM: Document the ASID logic in svm_flush_tlb() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 22/37] KVM: x86: Rename ->tlb_flush() to ->tlb_flush_all() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 23/37] KVM: nVMX: Add helper to handle TLB flushes on nested VM-Enter/VM-Exit Sean Christopherson
2021-10-28 13:11   ` Lai Jiangshan
2021-10-28 15:22     ` Sean Christopherson
2021-10-29  0:44       ` Lai Jiangshan
2021-10-29 17:10         ` Sean Christopherson
2021-10-30  1:34           ` Lai Jiangshan [this message]
2021-11-04 17:47             ` Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 24/37] KVM: x86: Introduce KVM_REQ_TLB_FLUSH_CURRENT to flush current ASID Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 25/37] KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 26/37] KVM: nVMX: Selectively use TLB_FLUSH_CURRENT for nested VM-Enter/VM-Exit Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 27/37] KVM: nVMX: Reload APIC access page on nested VM-Exit only if necessary Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 28/37] KVM: VMX: Retrieve APIC access page HPA only when necessary Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 29/37] KVM: VMX: Don't reload APIC access page if its control is disabled Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 30/37] KVM: x86/mmu: Move fast_cr3_switch() side effects to __kvm_mmu_new_cr3() Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 31/37] KVM: x86/mmu: Add separate override for MMU sync during fast CR3 switch Sean Christopherson
2020-03-24 11:07   ` Paolo Bonzini
2020-03-20 21:28 ` [PATCH v3 32/37] KVM: x86/mmu: Add module param to force TLB flush on root reuse Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 33/37] KVM: nVMX: Skip MMU sync on nested VMX transition when possible Sean Christopherson
2020-03-24 11:19   ` Paolo Bonzini
2020-03-20 21:28 ` [PATCH v3 34/37] KVM: nVMX: Don't flush TLB on nested VMX transition Sean Christopherson
2020-03-24 11:20   ` Paolo Bonzini
2020-03-24 18:10     ` Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 35/37] KVM: nVMX: Free only the affected contexts when emulating INVEPT Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 36/37] KVM: x86: Replace "cr3" with "pgd" in "new cr3/pgd" related code Sean Christopherson
2020-03-20 21:28 ` [PATCH v3 37/37] KVM: VMX: Clean cr3/pgd handling in vmx_load_mmu_pgd() Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJhGHyBNazRUiwPT5nGC=JNYX96J1dP9Y+KWFz7TeYuT3teYZg@mail.gmail.com' \
    --to=jiangshanlai+lkml@gmail.com \
    --cc=bgardon@google.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=jmattson@google.com \
    --cc=john.haxby@oracle.com \
    --cc=joro@8bytes.org \
    --cc=junaids@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liran.alon@oracle.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).