From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andy Lutomirski <luto@kernel.org>
Subject: Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's
 IA32_SGXLEPUBKEYHASHn runtime switch support
Date: Thu, 11 May 2017 20:28:37 -0700
Message-ID: <CALCETrXG6KZEuxzDRwrAC5Wp=tCFaUBt4oPMQE40gdYB4p_A_Q@mail.gmail.com>
References: <20170508052434.3627-1-kai.huang@linux.intel.com>
 <20170508052434.3627-9-kai.huang@linux.intel.com> <58dcdb2d-6894-b0a3-8d6f-2ab752fd6d22@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Kai Huang <kaih.linux@gmail.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Radim Krcmar <rkrcmar@redhat.com>,
        kvm list <kvm@vger.kernel.org>,
        "intel-sgx-kernel-dev@lists.01.org"
        <intel-sgx-kernel-dev@lists.01.org>, haim.cohen@intel.com
To: "Huang, Kai" <kai.huang@linux.intel.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail.kernel.org ([198.145.29.99]:51142 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753513AbdELD3A (ORCPT <rfc822;kvm@vger.kernel.org>);
        Thu, 11 May 2017 23:29:00 -0400
Received: from mail-ua0-f178.google.com (mail-ua0-f178.google.com [209.85.217.178])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by mail.kernel.org (Postfix) with ESMTPSA id 9281B239A1
        for <kvm@vger.kernel.org>; Fri, 12 May 2017 03:28:59 +0000 (UTC)
Received: by mail-ua0-f178.google.com with SMTP id g49so40075648uaa.1
        for <kvm@vger.kernel.org>; Thu, 11 May 2017 20:28:59 -0700 (PDT)
In-Reply-To: <58dcdb2d-6894-b0a3-8d6f-2ab752fd6d22@linux.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

[resending due to some kind of kernel.org glitch -- sorry if anyone
gets duplicates]

On Thu, May 11, 2017 at 5:32 PM, Huang, Kai <kai.huang@linux.intel.com> wrote:
> My current patch is based on this assumption. For KVM guest, naturally, we
> will write the cached value to real MSRs when vcpu is scheduled in. For
> host, SGX driver should write its own value to MSRs when it performs EINIT
> for LE.

This seems unnecessarily slow (perhaps *extremely* slow) to me.  I
would propose a totally different solution:

Have a percpu variable that stores the current SGXLEPUBKEYHASH along
with whatever lock is needed (probably just a mutex).  Users of EINIT
will take the mutex, compare the percpu variable to the desired value,
and, if it's different, do WRMSR and update the percpu variable.

KVM will implement writes to SGXLEPUBKEYHASH by updating its in-memory
state but *not* changing the MSRs.  KVM will trap and emulate EINIT to
support the same handling as the host.  There is no action required at
all on KVM guest entry and exit.

FWIW, I think that KVM will, in the long run, want to trap EINIT for
other reasons: someone is going to want to implement policy for what
enclaves are allowed that applies to guests as well as the host.
Also, some day Intel may fix its architectural design flaw [1] by
allowing EINIT to personalize the enclave's keying, and, if it's done
by a new argument to EINIT instead of an MSR, KVM will have to trap
EINIT to handle it.

>
> One argument against this approach is KVM guest should never have impact on
> host side, meaning host should not be aware of such MSR change

As a somewhat generic comment, I don't like this approach to KVM
development.  KVM mucks with lots of important architectural control
registers, and, in all too many cases, it tries to do so independently
of the other arch/x86 code.  This ends up causing all kinds of grief.

Can't KVM and the real x86 arch code cooperate for real?  The host and
the KVM code are in arch/x86 in the same source tree.

>
> 2. KVM should restore MSRs after changing for guest.

No, this is IMO silly.  Don't restore it on each exit, in the user
return hook, or anywhere else.  Just make sure the host knows it was
changed.

> Another thing is, not quite related to selecting which approach above, and
> either we choose approach 1 or approach 2, KVM still suffers the performance
> loss of writing (and/or reading) to IA32_SGXLEPUBKEYHASHn MSRs, either when
> vcpu scheduled in or during each VMEXIT/VMENTRY. Given the fact that the
> IA32_SGXLEPUBKEYHASHn will only be used by EINIT, We can actually do some
> optimization by trapping EINIT from guest and only update MSRs in EINIT
> VMEXIT.

Yep.

> But trapping ENCLS requires either 1) KVM to run ENCLS on hebalf of guest,
> in which case we have to reconstruct and remap guest's ENCLS parameters and
> skip the ENCLS for guest; 2) using MTF to let guest to run ENCLS again,
> while still trapping ENCLS.

I would advocate for the former approach.  (But you can't remap the
parameters due to TOCTOU issues, locking, etc.  Just copy them.  I
don't see why this is any more complicated than emulating any other
instruction that accesses memory.)

If necessary for some reason, trap EINIT when the SGXLEPUBKEYKASH is
wrong and then clear the exit flag once the MSRs are in sync.  You'll
need to be careful to avoid races in which the host's value leaks into
the guest.  I think you'll find that this is more complicated, less
flexible, and less performant than just handling ENCLS[EINIT] directly
in the host.

[1] Guests that steal sealed data from each other or from the host can
manipulate that data without compromising the hypervisor by simply
loading the same enclave that its rightful owner would use.  If you're
trying to use SGX to protect your crypto credentials so that, if
stolen, they can't be used outside the guest, I would consider this to
be a major flaw.  It breaks the security model in a multi-tenant cloud
situation.  I've complained about it before.