From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andy Lutomirski <luto@kernel.org>
Subject: Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's
 IA32_SGXLEPUBKEYHASHn runtime switch support
Date: Thu, 11 May 2017 23:11:59 -0700
Message-ID: <CALCETrUcYfb8E2Ot=WhPH79bVNoPxyBX1ot02o_QvxVsLsQnMg@mail.gmail.com>
References: <20170508052434.3627-1-kai.huang@linux.intel.com>
 <20170508052434.3627-9-kai.huang@linux.intel.com> <58dcdb2d-6894-b0a3-8d6f-2ab752fd6d22@linux.intel.com>
 <CALCETrXG6KZEuxzDRwrAC5Wp=tCFaUBt4oPMQE40gdYB4p_A_Q@mail.gmail.com> <6ab7ec4e-e0fa-af47-11b2-f26edcb088fb@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Andy Lutomirski <luto@kernel.org>,
        Kai Huang <kaih.linux@gmail.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Radim Krcmar <rkrcmar@redhat.com>,
        kvm list <kvm@vger.kernel.org>,
        "intel-sgx-kernel-dev@lists.01.org"
        <intel-sgx-kernel-dev@lists.01.org>, haim.cohen@intel.com
To: "Huang, Kai" <kai.huang@linux.intel.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail.kernel.org ([198.145.29.99]:59194 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751266AbdELGMW (ORCPT <rfc822;kvm@vger.kernel.org>);
        Fri, 12 May 2017 02:12:22 -0400
Received: from mail-ua0-f172.google.com (mail-ua0-f172.google.com [209.85.217.172])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by mail.kernel.org (Postfix) with ESMTPSA id 3B6FE23993
        for <kvm@vger.kernel.org>; Fri, 12 May 2017 06:12:21 +0000 (UTC)
Received: by mail-ua0-f172.google.com with SMTP id j17so41227434uag.3
        for <kvm@vger.kernel.org>; Thu, 11 May 2017 23:12:21 -0700 (PDT)
In-Reply-To: <6ab7ec4e-e0fa-af47-11b2-f26edcb088fb@linux.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Thu, May 11, 2017 at 9:56 PM, Huang, Kai <kai.huang@linux.intel.com> wrote:
> I am not sure whether the cost of writing to 4 MSRs would be *extremely*
> slow, as when vcpu is scheduled in, KVM is already doing vmcs_load, writing
> to several MSRs, etc.

I'm speculating that these MSRs may be rather unoptimized and hence
unusually slow.

>
>>
>> Have a percpu variable that stores the current SGXLEPUBKEYHASH along
>> with whatever lock is needed (probably just a mutex).  Users of EINIT
>> will take the mutex, compare the percpu variable to the desired value,
>> and, if it's different, do WRMSR and update the percpu variable.
>>
>> KVM will implement writes to SGXLEPUBKEYHASH by updating its in-memory
>> state but *not* changing the MSRs.  KVM will trap and emulate EINIT to
>> support the same handling as the host.  There is no action required at
>> all on KVM guest entry and exit.
>
>
> This is doable, but SGX driver needs to do those things and expose
> interfaces for KVM to use. In terms of the percpu data, it is nice to have,
> but I am not sure whether it is mandatory, as IMO EINIT is not even in
> performance critical path. We can simply read old value from MSRs out and
> compare whether the old equals to the new.

I think the SGX driver should probably live in arch/x86, and the
interface could be a simple percpu variable that is exported (from the
main kernel image, not from a module).
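
A minimal user-space sketch of the cached-write scheme described above,
in case it helps.  Everything here is a stand-in: sim_wrmsr(),
struct le_hash, and le_hash_sync() are hypothetical names, the "MSRs"
are a plain array, and a real kernel version would use an actual percpu
variable, a real WRMSR, and real locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LE_HASH_WORDS 4	/* IA32_SGXLEPUBKEYHASH0..3 */

/* Hypothetical container for the four 64-bit hash words. */
struct le_hash {
	uint64_t w[LE_HASH_WORDS];
};

static struct le_hash sim_msrs;		/* simulated hardware MSRs */
static struct le_hash cached_hash;	/* stand-in for the percpu variable */
static bool cache_valid;		/* false until the first write */
static unsigned int wrmsr_count;	/* counts simulated WRMSRs */

/* Simulated WRMSR: a real implementation would write the MSR here. */
static void sim_wrmsr(int idx, uint64_t val)
{
	assert(idx >= 0 && idx < LE_HASH_WORDS);
	sim_msrs.w[idx] = val;
	wrmsr_count++;
}

/*
 * Make the MSRs match *want, skipping the writes when the cached value
 * already matches.  The caller is assumed to hold whatever lock
 * serializes EINIT on this CPU.
 */
static void le_hash_sync(const struct le_hash *want)
{
	if (cache_valid && !memcmp(&cached_hash, want, sizeof(*want)))
		return;

	for (int i = 0; i < LE_HASH_WORDS; i++)
		sim_wrmsr(i, want->w[i]);

	cached_hash = *want;
	cache_valid = true;
}
```

Both the host EINIT path and KVM's emulated EINIT would call something
like le_hash_sync() right before executing EINIT, so back-to-back
launches with the same hash cost no MSR writes at all.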

>
>>
>> FWIW, I think that KVM will, in the long run, want to trap EINIT for
>> other reasons: someone is going to want to implement policy for what
>> enclaves are allowed that applies to guests as well as the host.
>
>
> I am not very convinced why "what enclaves are allowed" in host would apply
> to guest. Can you elaborate? I mean in general virtualization just focus
> emulating hardware behavior. If a native machine is able to run any LE, the
> virtual machine should be able to as well (of course, with guest's
> IA32_FEATURE_CONTROL[bit 17] set).

I strongly disagree.  I can imagine two classes of sensible policies
for launch control:

1. Allow everything.  This seems quite sensible to me.

2. Allow some things, and make sure that VMs have at least as
restrictive a policy as host root has.  After all, what's the point of
restricting enclaves in the host if host code can simply spawn a
little VM to run otherwise-disallowed enclaves?
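
As a rough illustration of policy 2, here is a sketch in which a guest
launch is allowed only if the host's policy would also allow it.  The
integer signer IDs, the host_allowed[] list, and the function names are
all made up for the example; a real policy would presumably key on
something like the enclave signer's hash.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host allowlist; the IDs are arbitrary example values. */
static const uint32_t host_allowed[] = { 10, 20, 30 };

/* Linear search for id in list[0..n). */
static bool in_list(const uint32_t *list, size_t n, uint32_t id)
{
	for (size_t i = 0; i < n; i++)
		if (list[i] == id)
			return true;
	return false;
}

/* Would host root be allowed to launch an enclave from this signer? */
static bool host_allows(uint32_t signer)
{
	return in_list(host_allowed,
		       sizeof(host_allowed) / sizeof(host_allowed[0]),
		       signer);
}

/*
 * A guest launch passes only if the guest's own list AND the host's
 * list both allow it, so the guest can never be less restricted than
 * the host.
 */
static bool guest_allows(const uint32_t *guest_list, size_t n,
			 uint32_t signer)
{
	return host_allows(signer) && in_list(guest_list, n, signer);
}
```

With this shape, spawning a VM buys nothing: an enclave the host
refuses is refused in every guest too, no matter what the guest's own
list says.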

>
>> Also, some day Intel may fix its architectural design flaw [1] by
>> allowing EINIT to personalize the enclave's keying, and, if it's done
>> by a new argument to EINIT instead of an MSR, KVM will have to trap
>> EINIT to handle it.
>
>
> Looks this flaw is not the same issue as above (host enclave policy applies
> to guest)?

It's related.  Without this flaw, it might make sense to apply looser
policy in the guest than in the host.  With this flaw, I think your
policy fails to have any real effect if you don't enforce it on
guests.

>
>>
>>>
>>> One argument against this approach is KVM guest should never have impact
>>> on
>>> host side, meaning host should not be aware of such MSR change
>>
>>
>> As a somewhat generic comment, I don't like this approach to KVM
>> development.  KVM mucks with lots of important architectural control
>> registers, and, in all too many cases, it tries to do so independently
>> of the other arch/x86 code.  This ends up causing all kinds of grief.
>>
>> Can't KVM and the real x86 arch code cooperate for real?  The host and
>> the KVM code are in arch/x86 in the same source tree.
>
>
> Currently on host SGX driver, which is pretty much self-contained,
> implements all SGX related stuff.

I will probably NAK this if it comes my way for inclusion upstream.
Just because it can be self-contained doesn't mean it should be
self-contained.

>>
>> I would advocate for the former approach.  (But you can't remap the
>> parameters due to TOCTOU issues, locking, etc.  Just copy them.  I
>> don't see why this is any more complicated than emulating any other
>> instruction that accesses memory.)
>
>
> No you cannot just copy. Because all address in guest's ENCLS parameters are
> guest's virtual address, we cannot use them to execute ENCLS in KVM. If any
> guest virtual addresses is used in ENCLS parameters, for example,
> PAGEINFO.SECS, PAGEINFO.SECINFO/PCMD, etc, you have to remap them to KVM's
> virtual address.
>
> Btw, what is TOCTOU issue? would you also elaborate locking issue?

I was partially mis-remembering how this worked.  It looks like
SIGSTRUCT and EINITTOKEN could be copied but SECS would have to be
mapped.  If KVM applied some policy to the launchable enclaves, it
would want to make sure that it only looks at fields that are copied
to make sure that the enclave that gets launched is the one it
verified.  The locking issue I'm imagining is making sure that the SECS
(or whatever else might be mapped) doesn't disappear and get reused for
something else while it's mapped in the host.  Presumably KVM has an
existing mechanism for this, but maybe SECS is special because it's
not quite normal memory IIRC.
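
The copy-then-check idea can be sketched roughly as follows.  This is a
user-space toy, not the real SIGSTRUCT layout: struct sigstruct_sim,
its signer_hash field, and both function names are hypothetical, and
the "guest memory" is just a buffer that something else may rewrite at
any time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIGNER_HASH_LEN 32

/* Hypothetical, heavily trimmed stand-in for a SIGSTRUCT. */
struct sigstruct_sim {
	uint8_t signer_hash[SIGNER_HASH_LEN];
};

/* Example policy: only one signer is allowed. */
static bool policy_allows(const struct sigstruct_sim *s,
			  const uint8_t *allowed)
{
	return memcmp(s->signer_hash, allowed, SIGNER_HASH_LEN) == 0;
}

/*
 * Copy the guest's structure exactly once into host-private memory,
 * then apply policy to the copy.  On success (return 0) the caller
 * must hand *out, never guest_mem, to EINIT, so the enclave that gets
 * launched is the one that was checked.  Returns -1 if policy rejects.
 */
static int snapshot_and_check(const volatile uint8_t *guest_mem,
			      const uint8_t *allowed,
			      struct sigstruct_sim *out)
{
	for (size_t i = 0; i < SIGNER_HASH_LEN; i++)
		out->signer_hash[i] = guest_mem[i];

	return policy_allows(out, allowed) ? 0 : -1;
}
```

If the guest rewrites its buffer after the check, the snapshot is
unaffected, which is the whole point: checking guest memory in place
and then letting EINIT read it again would reopen the TOCTOU window.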

>
>>
>> If necessary for some reason, trap EINIT when the SGXLEPUBKEYHASH is
>> wrong and then clear the exit flag once the MSRs are in sync.  You'll
>> need to be careful to avoid races in which the host's value leaks into
>> the guest.  I think you'll find that this is more complicated, less
>> flexible, and less performant than just handling ENCLS[EINIT] directly
>> in the host.
>
>
> Sorry I don't quite follow this part. Why would host's value leaks into
> guest? I suppose the *value* means host's IA32_SGXLEPUBKEYHASHn? guest's MSR
> read/write is always trapped and emulated by KVM.

You'd need to make sure that this sequence of events doesn't happen:

 - Guest does EINIT and it exits.
 - Host updates the MSRs and the ENCLS-exiting bitmap.
 - Guest is preempted before it retries EINIT.
 - A different host thread launches an enclave, thus changing the MSRs.
 - Guest resumes and runs EINIT, without exiting, using the wrong MSR values.
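
The sequence above can be replayed in a small deterministic simulation.
This is only a toy model under stated assumptions: the four hash MSRs
are collapsed into one word, struct sim_cpu and all function names are
invented, and the rearm_on_host_write flag is one hypothetical
mitigation (re-arming the exit whenever the host changes the MSRs), not
something taken from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum einit_result { EINIT_EXITED, EINIT_RAN_OK, EINIT_RAN_WRONG_HASH };

struct sim_cpu {
	uint64_t msr_hash;		/* simulated SGXLEPUBKEYHASH MSRs */
	bool einit_exiting;		/* simulated ENCLS-exiting bit for EINIT */
	bool rearm_on_host_write;	/* hypothetical mitigation knob */
};

/* Host thread launches an enclave: loads the host's hash. */
static void host_einit(struct sim_cpu *cpu, uint64_t host_hash)
{
	cpu->msr_hash = host_hash;
	if (cpu->rearm_on_host_write)
		cpu->einit_exiting = true;
}

/* VM-exit handler: sync MSRs to the guest's hash, stop trapping. */
static void handle_einit_exit(struct sim_cpu *cpu, uint64_t guest_hash)
{
	cpu->msr_hash = guest_hash;
	cpu->einit_exiting = false;
}

/* Guest executes EINIT: either exits or runs with the current MSRs. */
static enum einit_result guest_einit(struct sim_cpu *cpu,
				     uint64_t guest_hash)
{
	if (cpu->einit_exiting) {
		handle_einit_exit(cpu, guest_hash);
		return EINIT_EXITED;
	}
	return cpu->msr_hash == guest_hash ? EINIT_RAN_OK
					   : EINIT_RAN_WRONG_HASH;
}

/* Replay the listed sequence and report how the guest's retry ends. */
static enum einit_result replay_race(bool rearm)
{
	const uint64_t guest_hash = 0x1111, host_hash = 0x2222;
	struct sim_cpu cpu = {
		.msr_hash = host_hash,
		.einit_exiting = true,
		.rearm_on_host_write = rearm,
	};
	enum einit_result r;

	r = guest_einit(&cpu, guest_hash);	/* EINIT exits, host syncs */
	assert(r == EINIT_EXITED);
	host_einit(&cpu, host_hash);		/* guest preempted, host EINIT */
	r = guest_einit(&cpu, guest_hash);	/* guest resumes and retries */
	if (r == EINIT_EXITED)			/* trapped again: retry once */
		r = guest_einit(&cpu, guest_hash);
	return r;
}
```

In this model replay_race(false) ends with the guest running EINIT
against the host's hash, while replay_race(true) traps a second time
and then runs with the right one, at the cost of extra exits.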

>
>>
>> [1] Guests that steal sealed data from each other or from the host can
>> manipulate that data without compromising the hypervisor by simply
>> loading the same enclave that its rightful owner would use.  If you're
>> trying to use SGX to protect your crypto credentials so that, if
>> stolen, they can't be used outside the guest, I would consider this to
>> be a major flaw.  It breaks the security model in a multi-tenant cloud
>> situation.  I've complained about it before.
>>
>
> Looks potentially only guest's IA32_SGXLEPUBKEYHASHn may be leaked? In this
> case even it is leaked looks we cannot dig anything out just the hash value?

Not sure what you mean.  Are you asking about the lack of guest personalization?

Concretely, imagine I write an enclave that seals my TLS client
certificate's private key and offers an API to sign TLS certificate
requests with it.  This way, if my system is compromised, an attacker
can use the certificate only so long as they have access to my
machine.  If I kick them out or if they merely get the ability to read
the sealed data but not to execute code, the private key should still
be safe.  But, if this system is a VM guest, the attacker could run
the exact same enclave on another guest on the same physical CPU and
sign using my key.  Whoops!