All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzbot+f1d2136db9c80d4733e8@syzkaller.appspotmail.com
Subject: Re: [PATCH 2/4] KVM: nVMX: Synthesize TRIPLE_FAULT for L2 if emulation is required
Date: Tue, 14 Dec 2021 11:12:32 +0200	[thread overview]
Message-ID: <1c66981541a52e8d7c2b72db2ecd1bcc79c16be6.camel@redhat.com> (raw)
In-Reply-To: <20211207193006.120997-3-seanjc@google.com>

On Tue, 2021-12-07 at 19:30 +0000, Sean Christopherson wrote:
> Synthesize a triple fault if L2 guest state is invalid at the time of
> VM-Enter, which can happen if L1 modifies SMRAM or if userspace stuffs
> guest state via ioctls(), e.g. KVM_SET_SREGS.  KVM should never emulate
> invalid guest state, since from L1's perspective, it's architecturally
> impossible for L2 to have invalid state while L2 is running in hardware.
> E.g. attempts to set CR0 or CR4 to unsupported values will either VM-Exit
> or #GP.
> 
> Modifying vCPU state via RSM+SMRAM and ioctl() are the only paths that
> can trigger this scenario, as nested VM-Enter correctly rejects any
> attempt to enter L2 with invalid state.
> 
> RSM is a straightforward case as (a) KVM follows AMD's SMRAM layout and
> behavior, and (b) Intel's SDM states that loading reserved CR0/CR4 bits
> via RSM results in shutdown, i.e. there is precedent for KVM's behavior.
> Following AMD's SMRAM layout is important as AMD's layout saves/restores
> the descriptor cache information, including CS.RPL and SS.RPL, and also
> defines all the fields relevant to invalid guest state as read-only, i.e.
> so long as the vCPU had valid state before the SMI, which is guaranteed
> for L2, RSM will generate valid state unless SMRAM was modified.  Intel's
> layout saves/restores only the selector, which means that scenarios where
> the selector and cached RPL don't match, e.g. conforming code segments,
> would yield invalid guest state.  Intel CPUs fudge around this issued by
> stuffing SS.RPL and CS.RPL on RSM.  Per Intel's SDM on the "Default
> Treatment of RSM", paraphrasing for brevity:
> 
>   IF internal storage indicates that the [CPU was post-VMXON]
>   THEN
>      enter VMX operation (root or non-root);
>      restore VMX-critical state as defined in Section 34.14.1;
>      set to their fixed values any bits in CR0 and CR4 whose values must
>      be fixed in VMX operation [unless coming from an unrestricted guest];
>      IF RFLAGS.VM = 0 AND (in VMX root operation OR the
>         “unrestricted guest” VM-execution control is 0)
>      THEN
>        CS.RPL := SS.DPL;
>        SS.RPL := SS.DPL;
>      FI;
>      restore current VMCS pointer;
>   FI;
> 
> Note that Intel CPUs also overwrite the fixed CR0/CR4 bits, whereas KVM
> will sythesize TRIPLE_FAULT in this scenario.  KVM's behavior is allowed
> as both Intel and AMD define CR0/CR4 SMRAM fields as read-only, i.e. the
> only way for CR0 and/or CR4 to have illegal values is if they were
> modified by the L1 SMM handler, and Intel's SDM "SMRAM State Save Map"
> section states "modifying these registers will result in unpredictable
> behavior".
> 
> KVM's ioctl() behavior is less straightforward.  Because KVM allows
> ioctls() to be executed in any order, rejecting an ioctl() if it would
> result in invalid L2 guest state is not an option as KVM cannot know if
> a future ioctl() would resolve the invalid state, e.g. KVM_SET_SREGS, or
> drop the vCPU out of L2, e.g. KVM_SET_NESTED_STATE.  Ideally, KVM would
> reject KVM_RUN if L2 contained invalid guest state, but that carries the
> risk of a false positive, e.g. if RSM loaded invalid guest state and KVM
> exited to userspace.  Setting a flag/request to detect such a scenario is
> undesirable because (a) it's extremely unlikely to add value to KVM as a
> whole, and (b) KVM would need to consider ioctl() interactions with such
> a flag, e.g. if userspace migrated the vCPU while the flag were set.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 32 ++++++++++++++++++++++++--------
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9e415e5a91ab..5307fcee3b3b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5900,18 +5900,14 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  		vmx_flush_pml_buffer(vcpu);
>  
>  	/*
> -	 * We should never reach this point with a pending nested VM-Enter, and
> -	 * more specifically emulation of L2 due to invalid guest state (see
> -	 * below) should never happen as that means we incorrectly allowed a
> -	 * nested VM-Enter with an invalid vmcs12.
> +	 * KVM should never reach this point with a pending nested VM-Enter.
> +	 * More specifically, short-circuiting VM-Entry to emulate L2 due to
> +	 * invalid guest state should never happen as that means KVM knowingly
> +	 * allowed a nested VM-Enter with an invalid vmcs12.  More below.
>  	 */
>  	if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
>  		return -EIO;
>  
> -	/* If guest state is invalid, start emulating */
> -	if (vmx->emulation_required)
> -		return handle_invalid_guest_state(vcpu);
> -
>  	if (is_guest_mode(vcpu)) {
>  		/*
>  		 * PML is never enabled when running L2, bail immediately if a
> @@ -5933,10 +5929,30 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  		 */
>  		nested_mark_vmcs12_pages_dirty(vcpu);
>  
> +		/*
> +		 * Synthesize a triple fault if L2 state is invalid.  In normal
> +		 * operation, nested VM-Enter rejects any attempt to enter L2
> +		 * with invalid state.  However, those checks are skipped if
> +		 * state is being stuffed via RSM or KVM_SET_NESTED_STATE.  If
> +		 * L2 state is invalid, it means either L1 modified SMRAM state
> +		 * or userspace provided bad state.  Synthesize TRIPLE_FAULT as
> +		 * doing so is architecturally allowed in the RSM case, and is
> +		 * the least awful solution for the userspace case without
> +		 * risking false positives.
> +		 */
> +		if (vmx->emulation_required) {
> +			nested_vmx_vmexit(vcpu, EXIT_REASON_TRIPLE_FAULT, 0, 0);
> +			return 1;
> +		}
> +
>  		if (nested_vmx_reflect_vmexit(vcpu))
>  			return 1;
>  	}
>  
> +	/* If guest state is invalid, start emulating.  L2 is handled above. */
> +	if (vmx->emulation_required)
> +		return handle_invalid_guest_state(vcpu);
> +
>  	if (exit_reason.failed_vmentry) {
>  		dump_vmcs(vcpu);
>  		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;



Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Best regards,
	Maxim Levitsky


  reply	other threads:[~2021-12-14  9:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-07 19:30 [PATCH 0/4] KVM: VMX: Fix handling of invalid L2 guest state Sean Christopherson
2021-12-07 19:30 ` [PATCH 1/4] KVM: VMX: Always clear vmx->fail on emulation_required Sean Christopherson
2021-12-14  9:12   ` Maxim Levitsky
2021-12-07 19:30 ` [PATCH 2/4] KVM: nVMX: Synthesize TRIPLE_FAULT for L2 if emulation is required Sean Christopherson
2021-12-14  9:12   ` Maxim Levitsky [this message]
2021-12-07 19:30 ` [PATCH 3/4] KVM: VMX: Fix stale docs for kvm-intel.emulate_invalid_guest_state Sean Christopherson
2021-12-14  9:12   ` Maxim Levitsky
2021-12-07 19:30 ` [PATCH 4/4] KVM: selftests: Add test to verify TRIPLE_FAULT on invalid L2 guest state Sean Christopherson
2021-12-14  9:15   ` Maxim Levitsky
2021-12-09 11:26 ` [PATCH 0/4] KVM: VMX: Fix handling of " Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1c66981541a52e8d7c2b72db2ecd1bcc79c16be6.camel@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=syzbot+f1d2136db9c80d4733e8@syzkaller.appspotmail.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.