LKML Archive on lore.kernel.org
From: Liran Alon <liran.alon@oracle.com>
To: Alexander Graf <graf@amazon.de>,
	"Paraschiv, Andra-Irina" <andraprs@amazon.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>,
	Benjamin Herrenschmidt <benh@amazon.com>,
	Colm MacCarthaigh <colmmacc@amazon.com>,
	Bjoern Doebel <doebel@amazon.de>,
	David Woodhouse <dwmw@amazon.co.uk>,
	Frank van der Linden <fllinden@amazon.com>,
	Martin Pohlack <mpohlack@amazon.de>, Matt Wilson <msw@amazon.com>,
	Balbir Singh <sblbir@amazon.com>,
	Stewart Smith <trawets@amazon.com>,
	Uwe Dannowski <uwed@amazon.de>,
	kvm@vger.kernel.org, ne-devel-upstream@amazon.com
Subject: Re: [PATCH v1 00/15] Add support for Nitro Enclaves
Date: Tue, 28 Apr 2020 19:01:45 +0300
Message-ID: <6789d559-cb57-24b5-2c4a-57c6d18133a2@oracle.com>
In-Reply-To: <50f58a36-76ee-5e97-f5e6-1f08bee0c596@amazon.de>


On 28/04/2020 18:25, Alexander Graf wrote:
>
>
> On 27.04.20 13:44, Liran Alon wrote:
>>
>> On 27/04/2020 10:56, Paraschiv, Andra-Irina wrote:
>>>
>>> On 25/04/2020 18:25, Liran Alon wrote:
>>>>
>>>> On 23/04/2020 16:19, Paraschiv, Andra-Irina wrote:
>>>>>
>>>>> The memory and CPUs are carved out of the primary VM, they are
>>>>> dedicated for the enclave. The Nitro hypervisor running on the host
>>>>> ensures memory and CPU isolation between the primary VM and the
>>>>> enclave VM.
>>>> I hope you properly take into consideration Hyper-Threading
>>>> speculative side-channel vulnerabilities here.
>>>> i.e. Cloud providers usually designate each CPU core to run only
>>>> vCPUs of a specific guest, to avoid sharing a single CPU core
>>>> between multiple guests.
>>>> To handle this properly, you need some kind of core-scheduling
>>>> mechanism (such that each CPU core runs either only vCPUs of the
>>>> enclave or only vCPUs of the primary VM at any given point in time).
>>>>
>>>> In addition, can you elaborate more on how the enclave memory is
>>>> carved out of the primary VM?
>>>> Does this involve performing a memory hot-unplug operation from the
>>>> primary VM, or just unmapping the enclave-assigned guest physical
>>>> pages from the primary VM's SLAT (EPT/NPT) and mapping them only in
>>>> the enclave's SLAT?
>>>
>>> Correct, we take into consideration the HT setup. The enclave gets
>>> dedicated physical cores. The primary VM and the enclave VM don't run
>>> on CPU siblings of a physical core.
>> The way I would imagine this to work is that the Primary-VM just
>> specifies how many vCPUs the Enclave-VM will have, and those vCPUs are
>> set with affinity to run on the same physical CPU cores as the
>> Primary-VM, except that the scheduler is modified to never run vCPUs of
>> the Primary-VM and the Enclave-VM as siblings on the same physical CPU
>> core (core-scheduling). i.e. This is different from the Primary-VM
>> permanently losing physical CPU cores for as long as the Enclave-VM is
>> running.
>> Or maybe this should even be controlled by a knob in the virtual PCI
>> device interface, to give the customer the flexibility to decide whether
>> the Enclave-VM needs dedicated CPU cores or whether it is ok to share
>> them with the Primary-VM, as long as core-scheduling is used to
>> guarantee proper isolation.
>
> Running both parent and enclave on the same core can *potentially* 
> lead to L2 cache leakage, so we decided not to go with it :).
Haven't thought about the L2 cache. Makes sense. Ack.
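
Just to spell out what I understand the no-sibling-sharing constraint to
mean, here is a rough sketch that reads the standard sysfs topology files
to see which logical CPUs share a physical core (the paths are the usual
Linux ones; the CPU count and the printout are purely illustrative):

#include <stdio.h>

int main(void)
{
	char path[128], siblings[64];
	int cpu;

	/* Illustrative only: walk a handful of logical CPUs and print which
	 * ones share a physical core, i.e. which placements would violate
	 * the "no primary-VM/enclave-VM siblings on one core" rule. */
	for (cpu = 0; cpu < 4; cpu++) {
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
			 cpu);
		FILE *f = fopen(path, "r");
		if (!f)
			continue;
		if (fgets(siblings, sizeof(siblings), f))
			printf("cpu%d shares a core with: %s", cpu, siblings);
		fclose(f);
	}
	return 0;
}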
>
>>>
>>> Regarding the memory carve out, the logic includes page table entries
>>> handling.
>> As I thought. Thanks for the confirmation.
>>>
>>> IIRC, memory hot-unplug can be used for the memory blocks that were
>>> previously hot-plugged.
>>>
>>> https://www.kernel.org/doc/html/latest/admin-guide/mm/memory-hotplug.html
>>>
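
Right, and per that doc, a previously hot-plugged block can be taken
offline through its sysfs state file. A minimal sketch (needs root; the
block number is just a placeholder, not anything specific to this series):

#include <stdio.h>

int main(void)
{
	/* Illustrative only: offline one memory block via the documented
	 * sysfs interface; "memory32" is a placeholder block number. */
	FILE *f = fopen("/sys/devices/system/memory/memory32/state", "w");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fputs("offline", f) == EOF)
		perror("fputs");
	fclose(f);
	return 0;
}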
>>>>
>>>> I don't quite understand why the Enclave VM needs to be
>>>> provisioned/torn down during the primary VM's runtime.
>>>>
>>>> For example, an alternative could have been to just provision both
>>>> the primary VM and the Enclave VM at primary VM startup.
>>>> Then, wait for the primary VM to set up a communication channel with
>>>> the Enclave VM (e.g. via virtio-vsock).
>>>> Then, the primary VM is free to request that the Enclave VM perform
>>>> various tasks when required in the isolated environment.
>>>>
>>>> Such a setup would mimic a common enclave setup, such as Microsoft
>>>> Windows VBS EPT-based Enclaves (which all run in VTL1). It is also
>>>> similar to TEEs running on ARM TrustZone.
>>>> i.e. In my alternative proposed solution, the Enclave VM is similar
>>>> to VTL1/TrustZone.
>>>> It would also avoid introducing a new PCI device and driver.
>>>
>>> True, this can be another option, to provision the primary VM and the
>>> enclave VM at launch time.
>>>
>>> In the proposed setup, the primary VM starts with the initially
>>> allocated resources (memory, CPUs). The launch path of the enclave VM,
>>> as it's spawned on the same host, is done via the ioctl interface -
>>> PCI device - host hypervisor path. Short-running or long-running
>>> enclaves can be bootstrapped during the primary VM's lifetime. Depending
>>> on the use case, a custom set of resources (memory and CPUs) is set for
>>> an enclave and then given back when the enclave is terminated; these
>>> resources can be used for another enclave spawned later on or for the
>>> primary VM's tasks.
>>>
>> Yes, I already understood this is how the mechanism works. I'm
>> questioning whether this is indeed a good approach that should also be
>> taken by upstream.
>
> I thought the point of Linux was to support devices that exist, rather 
> than change the way the world works around it? ;)
I agree. Just poking around to see if upstream wants to implement a 
different approach for enclaves, regardless of accepting the Nitro 
Enclaves virtual PCI driver for the AWS use-case, of course.
>
>> The use-case for Nitro Enclaves is a confidential-computing service,
>> i.e. the ability to provision a compute instance that can be trusted to
>> perform computations on sensitive information with high confidence that
>> it cannot be compromised, as it is highly isolated. Some technologies,
>> such as Intel SGX and AMD SEV, attempted to achieve this even with
>> guarantees that the computation is isolated from the hardware and
>> hypervisor itself.
>
> Yeah, that worked really well, didn't it? ;)
You haven't seen me saying SGX worked well. :)
AMD SEV though still has its shot (once SEV-SNP is GA).
>
>> I would have expected that for the vast majority of real customer
>> use-cases, the customer will provision a compute instance that runs some
>> confidential-computing task in an enclave, which it keeps running for
>> the entire lifetime of the compute instance, as the sole purpose of the
>> compute instance is to expose a service that performs some
>> confidential-computing task.
>> For those cases, it should have been sufficient to just pre-provision a
>> single Enclave-VM that performs this task, together with the compute
>> instance, and connect them via virtio-vsock.
>> Without introducing any new virtual PCI device, guest PCI driver and
>> unique semantics of stealing resources (CPUs and memory) from the
>> Primary-VM at runtime.
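
To make the "connect them via virtio-vsock" part concrete, a minimal
sketch of what the primary-VM side could look like (the CID and port here
are placeholders, not anything defined by this series):

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int main(void)
{
	/* Illustrative only: connect from the primary VM to a service
	 * listening in a pre-provisioned enclave over AF_VSOCK. */
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid    = 16,	/* placeholder enclave CID */
		.svm_port   = 9000,	/* placeholder service port */
	};
	int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

	if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("vsock connect");
		return 1;
	}
	write(fd, "task request", 12);
	close(fd);
	return 0;
}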
>
> You would also need to preprovision the image that runs in the 
> enclave, which is usually only determined at runtime. For that you 
> need the PCI driver anyway, so why not make the creation dynamic too?
The image doesn't have to be determined at runtime. It could be supplied 
to the control-plane, as mentioned below.
>
>> In this Nitro Enclaves architecture, we de-facto put compute
>> control-plane abilities in the hands of the guest VM, instead of
>> introducing new control-plane primitives that allow building the
>> data-plane architecture desired by the customer in a flexible manner.
>> * What if the customer prefers to have its Enclave VM polling an S3
>> bucket for new tasks and producing results to S3 as well, without having
>> any "Primary-VM" or virtio-vsock connection of any kind?
>> * What if for some use-cases the customer wants the Enclave-VM to have
>> dedicated compute power (i.e. not sharing physical CPU cores with the
>> Primary-VM, not even with core-scheduling), but for other use-cases
>> prefers to share physical CPU cores with the Primary-VM (together with
>> core-scheduling guarantees)? (Although this could be addressed by
>> extending the virtual PCI device interface with a knob to control this.)
>>
>> An alternative would have been to have the following new control-plane
>> primitives:
>> * Ability to provision a VM without a boot-volume, but instead from an
>> image that is booted from memory, allowing disk-less VMs to be
>> provisioned.
>>    (E.g. can be useful for other use-cases, such as VMs not requiring
>> EBS at all, which could allow cheaper compute instances.)
>> * Ability to provision a group of VMs together, such that they are
>> guaranteed to launch as sibling VMs on the same host.
>> * Ability to create a fast-path connection between sibling VMs on the
>> same host with virtio-vsock, or even other shared-memory mechanisms.
>> * Extend AWS Fargate with the ability to run multiple microVMs as a
>> group (similar to the above) connected with virtio-vsock, to allow
>> on-demand scaling of confidential-computing tasks.
>
> Yes, there are a *lot* of different ways to implement enclaves in a 
> cloud environment. This is the one that we focused on, but I'm sure 
> others in the space will have more ideas. It's definitely an 
> interesting space and I'm eager to see more innovation happening :).
>
>> Having said that, I do see an architecture similar to the Nitro Enclaves
>> virtual PCI device being used for a different purpose: hypervisor-based
>> security isolation (such as Windows VBS).
>> E.g. a Linux boot-loader could detect the presence of this virtual PCI
>> device and use it to provision multiple VM security domains, such that
>> when a security domain is created, it is specified which hardware
>> resources it has access to (guest memory pages, IO ports, MSRs, etc.)
>> and which blob it should run to bootstrap. Similar to, but superior to,
>> Hyper-V VSM. In addition, some security domains would be given special
>> abilities to control other security domains (for example, to control the
>> +XS,+XU EPT bits of other security domains to enforce code-integrity,
>> similar to Windows VBS HVCI). Just an idea... :)
>
> Yes, absolutely! So much fun to be had :D

:)

-Liran

>
>
> Alex
>
>
>
> Amazon Development Center Germany GmbH
> Krausenstr. 38
> 10117 Berlin
> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
> Sitz: Berlin
> Ust-ID: DE 289 237 879
>
>
