All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	open list <linux-kernel@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Wanpeng Li <wanpengli@tencent.com>,
	Ingo Molnar <mingo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>, Joerg Roedel <joro@8bytes.org>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, Jim Mattson <jmattson@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH v3 8/8] KVM: x86: avoid loading PDPTRs after migration when possible
Date: Mon, 21 Jun 2021 01:25:18 +0300	[thread overview]
Message-ID: <7df87b7f0b2e029b483d08611e70291aab4e4d0b.camel@redhat.com> (raw)
In-Reply-To: <YM0H3Hvs8/3+twnc@google.com>

On Fri, 2021-06-18 at 20:53 +0000, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 11260e83518f..eadfc9caf500 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -815,6 +815,8 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3)
> >  
> >  	memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
> >  	kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> > +	vcpu->arch.pdptrs_restored_oob = false;
> > +
> >  out:
> >  
> >  	return ret;
> > @@ -10113,6 +10115,7 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2)
> >  
> >  		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> >  		mmu_reset_needed = 1;
> > +		vcpu->arch.pdptrs_restored_oob = true;
> 
> Setting pdptrs_restored_oob[*] here and _only_ clearing it on successful
> load_pdptrs() is not robust.  Potential problems once the flag is set:

Hi Sean Christopherson!
Thanks for the review!

I also thought about the exact same thing when I submitted the last version of
this patches (prior version didn't clear the flag at all but I noticed that
while doing self review of my patches).


> 
>   1.  Userspace calls KVM_SET_SREGS{,2} without valid PDPTRs.  Flag is now stale.

True. It isn't that big issue though since the only way to this is to also disable PAE mode
in CR4 during the call or before it, thus PDPTRs becomes irrelevant.
Once PAE is enabled again, PDPTRs will be loaded again, resetting this flag.

Something to note is that we also don't clear available/dirty status of VCPU_EXREG_PDPTR
in this case.



>   2.  kvm_check_nested_events() VM-Exits to L1 before the flag is processed.
>       Flag is now stale.

Also true. However this means that we enter L1 now, and once we are ready to enter
L2 again, we will load PDPTRS from guest memory as thankfully VM entries in PAE mode
do load PDPTRS from guest memory (both on Intel and AMD).



> 
> (2) might not be problematic in practice since the "normal" load_pdptrs()
> should reset the flag on the next VM-Enter, but it's really, really hard to tell.
> E.g. what if an SMI causes an exit and _that_ non-VM-Enter reload of L2 state
> is the first to trip the flag?  The bool is essentially an extension of
> KVM_REQ_GET_NESTED_STATE_PAGES, I think it makes sense to clear the flag whenever
> KVM_REQ_GET_NESTED_STATE_PAGES is cleared.

Could you expalain a bit better about SMM case? When SMM entry is done, at least Intel
spec is silent on if PDPTRs are preserved in SMRAM (and it doesn't have any place allocated
for them).

We currently don't preserve PDTPRS on SMM entry and we reload them via CR3/CR0 write 
when we exit SMM.

Ah I see now, on VMX the non VM-Enter code path is also used for returns from SMM,
and it sets the KVM_REQ_GET_NESTED_STATE_PAGES, so this is a real issue.
If SMM code enabled PAE, and then KVM_SET_SREGS2 was used, and followed by
RSM, we indeed have a risk of not loading the PDPTRs.

I think that this is best fixed by resetting this flag in vmx_leave_smm,
since RSM loads PDPTRS from memory always otherwise.

I also note that we don't do KVM_REQ_GET_NESTED_STATE_PAGES on SVM,
on return from SMM at all,
thus on SVM I think I broke the resume from SMM to a guest if the guest is PAE.
Oh well....

I will now extend the testing I usually do to SMM and prepare a patch to fix this.

> 
> Another thing that's not obvious is the required ordering between KVM_SET_SREGS2
> and KVM_SET_NESTED_STATE.  AFAICT it's not documented, but that may be PEBKAC on
> my end.  E.g. what happens if walk_mmu == &root_mmu (L1 active in targte KVM)
> when SET_SREGS2 is called, and _then_ KVM_SET_NESTED_STATE is called?

Isn't that exactly the current ordering (and reason why I had to do this patch series)
First the KVM_SET_SREGS is called indeed prior to KVM_SET_NESTED_STATE and it can
potentially load wrong PDPTRS, and then KVM_SET_NESTED_STATE is called which used
to 'fix' this by reloading them always.



> 
> [*] pdptrs_from_userspace in Paolo's tree.
> 


I think that strictly speaking this flag should be cleared when PAE mode is disabled,
and together with clearing of availablity of VCPU_EXREG_PDPTR.

I don't agree that this flag is an extension of the KVM_REQ_GET_NESTED_STATE_PAGES.
I think this flag is more like an extra property of VCPU_EXREG_PDPTR.
In addition to being dirty/available, this "register" can be loaded from memory
or restored from migration stream.

Thanks again for the review,
Best regards,
	Maxim Levitsky


  parent reply	other threads:[~2021-06-20 22:25 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-07  9:01 [PATCH v3 0/8] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
2021-06-07  9:01 ` [PATCH v3 1/8] KVM: nVMX: Drop obsolete (and pointless) pdptrs_changed() check Maxim Levitsky
2021-06-07  9:01 ` [PATCH v3 2/8] KVM: nSVM: Drop pointless pdptrs_changed() check on nested transition Maxim Levitsky
2021-06-07  9:01 ` [PATCH v3 3/8] KVM: x86: Always load PDPTRs on CR3 load for SVM w/o NPT and a PAE guest Maxim Levitsky
2021-06-07  9:01 ` [PATCH v3 4/8] KVM: nSVM: refactor the CR3 reload on migration Maxim Levitsky
2021-06-07  9:02 ` [PATCH v3 5/8] KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES Maxim Levitsky
2021-06-07  9:02 ` [PATCH v3 6/8] KVM: x86: introduce kvm_register_clear_available Maxim Levitsky
2021-06-07  9:02 ` [PATCH v3 7/8] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 Maxim Levitsky
2021-06-10 15:06   ` Paolo Bonzini
2021-06-07  9:02 ` [PATCH v3 8/8] KVM: x86: avoid loading PDPTRs after migration when possible Maxim Levitsky
2021-06-18 20:53   ` Sean Christopherson
2021-06-19  7:03     ` Paolo Bonzini
2021-06-20 22:25     ` Maxim Levitsky [this message]
2021-06-10 15:06 ` [PATCH v3 0/8] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7df87b7f0b2e029b483d08611e70291aab4e4d0b.camel@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.