From: Tom Lendacky <thomas.lendacky@amd.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Gonda <pgonda@google.com>,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: Re: [PATCH 2/2] KVM: x86: Allow userspace to update tracked sregs for protected guests
Date: Mon, 10 May 2021 16:23:37 -0500
Message-ID: <26da40a0-c9b4-f517-94a6-5d3d69c4a207@amd.com>
In-Reply-To: <YJmfV1sO8miqvQLM@google.com>

On 5/10/21 4:02 PM, Sean Christopherson wrote:
> On Mon, May 10, 2021, Tom Lendacky wrote:
>> On 5/10/21 11:10 AM, Sean Christopherson wrote:
>>> On Fri, May 07, 2021, Tom Lendacky wrote:
>>>> On 5/7/21 11:59 AM, Sean Christopherson wrote:
>>>>> Allow userspace to set CR0, CR4, CR8, and EFER via KVM_SET_SREGS for
>>>>> protected guests, e.g. for SEV-ES guests with an encrypted VMSA.  KVM
>>>>> tracks the aforementioned registers by trapping guest writes, and also
>>>>> exposes the values to userspace via KVM_GET_SREGS.  Skipping the regs
>>>>> in KVM_SET_SREGS prevents userspace from updating KVM's CPU model to
>>>>> match the known hardware state.
>>>> This is very similar to the original patch I had proposed that you were
>>>> against :)
>>> I hope/think my position was that it should be unnecessary for KVM to need to
>>> know the guest's CR0/4/8 and EFER values, i.e. even the trapping is unnecessary.
>>> I was going to say I had a change of heart, as EFER.LMA in particular could
>>> still be required to identify 64-bit mode, but that's wrong; EFER.LMA only gets
>>> us long mode, the full is_64_bit_mode() needs access to cs.L, which AFAICT isn't
>>> provided by #VMGEXIT or trapping.
>> Right, that one is missing. If you take a VMGEXIT that uses the GHCB, then
>> I think you can assume we're in 64-bit mode.
> But that's not technically guaranteed.  The GHCB even seems to imply that there
> are scenarios where it's legal/expected to do VMGEXIT with a valid GHCB outside
> of 64-bit mode:
>   However, instead of issuing a HLT instruction, the AP will issue a VMGEXIT
>   with SW_EXITCODE of 0x8000_0004 (this implies that the GHCB was updated prior
>   to leaving 64-bit long mode).

Right, but in order to fill in the GHCB so that the hypervisor can read
it, the guest had to have been in 64-bit mode. Otherwise, whatever the
guest wrote will be seen as encrypted data and make no sense to the
hypervisor anyway.

> In practice, assuming the guest is in 64-bit mode will likely work, especially
> since the MSR-based protocol is extremely limited, but ideally there should be
> stronger language in the GHCB to define the exact VMM assumptions/behaviors.
> On the flip side, that assumption and the limited exposure through the MSR
> protocol means trapping CR0, CR4, and EFER is pointless.  I don't see how KVM
> can do anything useful with that information outside of VMGEXITs.  Page tables
> are encrypted and GPRs are stale; what else could KVM possibly do with
> identifying protected mode, paging, and/or 64-bit?
>>> Unless I'm missing something, that means that VMGEXIT(VMMCALL) is broken since
>>> KVM will incorrectly crush (or preserve) bits 63:32 of GPRs.  I'm guessing no
>>> one has reported a bug because either (a) no one has tested a hypercall that
>>> requires bits 63:32 in a GPR or (b) the guest just happens to be in 64-bit mode
>>> when KVM_SEV_LAUNCH_UPDATE_VMSA is invoked and so the segment registers are
>>> frozen to make it appear as if the guest is perpetually in 64-bit mode.
>> I don't think it's (b) since the LAUNCH_UPDATE_VMSA is done against reset-
>> state vCPUs.
>>> I see that sev_es_validate_vmgexit() checks ghcb_cpl_is_valid(), but isn't that
>>> either pointless or indicative of a much, much bigger problem?  If VMGEXIT is
>> It is needed for the VMMCALL exit.
>>> restricted to CPL0, then the check is pointless.  If VMGEXIT isn't restricted to
>>> CPL0, then KVM has a big gaping hole that allows a malicious/broken guest
>>> userspace to crash the VM simply by executing VMGEXIT.  Since valid_bitmap is
>>> cleared during VMGEXIT handling, I don't think guest userspace can attack/corrupt
>>> the guest kernel by doing a replay attack, but it does all but guarantee a
>>> VMGEXIT at CPL>0 will be fatal since the required valid bits won't be set.
>> Right, so I think some cleanup is needed there, both for the guest and the
>> hypervisor:
>> - For the guest, we could just clear the valid bitmask before leaving the
>>   #VC handler/releasing the GHCB. Userspace can't update the GHCB, so any
>>   VMGEXIT from userspace would just look like a no-op with the below
>>   change to KVM.
> Ah, right, the exit_code and exit infos need to be valid.
>> - For KVM, instead of returning -EINVAL from sev_es_validate_vmgexit(), we
>>   return the #GP action through the GHCB and continue running the guest.
> Agreed, KVM should never kill the guest in response to a bad VMGEXIT.  That
> should always be a guest decision.
>>> Sadly, the APM doesn't describe the VMGEXIT behavior, nor does any of the SEV-ES
>>> documentation I have.  I assume VMGEXIT is recognized at CPL>0 since it morphs
>>> to VMMCALL when SEV-ES isn't active.
>> Correct.
>>> I.e. either the ghcb_cpl_is_valid() check should be nuked, or more likely KVM
>> The ghcb_cpl_is_valid() is still needed to see whether the VMMCALL was
>> from userspace or not (a VMMCALL will generate a #VC).
> Blech.  I get that the GHCB spec says CPL must be provided/checked for VMMCALL,
> but IMO that makes no sense whatsoever.
> If the guest restricts the GHCB to CPL0, then the CPL field is pointless because
> the VMGEXIT will only ever come from CPL0.  Yes, technically the guest kernel
> can proxy a VMMCALL from userspace to the host, but the guest kernel _must_ be
> the one to enforce any desired CPL checks because the VMM is untrusted, at least
> once you get to SNP.
> If the guest exposes the GHCB to any CPL, then the CPL check is worthless because

The GHCB itself is not exposed to any CPL. A VMMCALL will generate a #VC.
The guest #VC handler will extract the CPL from the context that
generated the #VC (see vc_handle_vmmcall() in arch/x86/kernel/sev-es.c),
so that a VMMCALL from userspace will have the proper CPL value in the
GHCB when the #VC handler issues the VMGEXIT instruction.
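
As a rough illustration of that flow (a sketch only, not the code in
sev-es.c): the struct and function names below are hypothetical, and
deriving the CPL from the low two bits of the saved CS selector is an
assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the register context saved on #VC. */
struct sketch_regs {
	uint64_t ax;	/* hypercall number / first argument */
	uint64_t cs;	/* saved CS selector; bits 1:0 are the RPL */
};

/* Hypothetical, simplified stand-in for the GHCB fields used by VMMCALL. */
struct sketch_ghcb {
	uint64_t rax;
	uint8_t  cpl;
	int      rax_valid;
	int      cpl_valid;
};

/* Privilege level of the context that executed VMMCALL (assumed: CS.RPL). */
static uint8_t sketch_cpl(const struct sketch_regs *regs)
{
	return (uint8_t)(regs->cs & 0x3);
}

/*
 * Populate the GHCB from the interrupted context. In a real guest the
 * #VC handler would issue VMGEXIT after this; that step is omitted here.
 */
static void sketch_prepare_vmmcall(struct sketch_ghcb *ghcb,
				   const struct sketch_regs *regs)
{
	ghcb->rax = regs->ax;
	ghcb->rax_valid = 1;
	ghcb->cpl = sketch_cpl(regs);
	ghcb->cpl_valid = 1;
}
```

The point is that the CPL is written by the guest kernel's handler from
the saved context, not by whatever code executed the VMMCALL.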


> guest userspace can simply lie about the CPL.  And exposing the GHCB to userspace
> completely undermines guest privilege separation since hardware doesn't provide
> the real CPL, i.e. the VMM, even if it were trusted, can't determine the origin of
> the VMGEXIT.
