linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Graf <graf@amazon.de>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Andra Paraschiv <andraprs@amazon.com>,
	<linux-kernel@vger.kernel.org>,
	Anthony Liguori <aliguori@amazon.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Colm MacCarthaigh <colmmacc@amazon.com>,
	"Bjoern Doebel" <doebel@amazon.de>,
	David Woodhouse <dwmw@amazon.co.uk>,
	"Frank van der Linden" <fllinden@amazon.com>,
	Martin Pohlack <mpohlack@amazon.de>,
	"Matt Wilson" <msw@amazon.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Balbir Singh <sblbir@amazon.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	Stewart Smith <trawets@amazon.com>,
	"Uwe Dannowski" <uwed@amazon.de>, <kvm@vger.kernel.org>,
	<ne-devel-upstream@amazon.com>
Subject: Re: [PATCH v3 07/18] nitro_enclaves: Init misc device providing the ioctl interface
Date: Thu, 28 May 2020 15:01:36 +0200	[thread overview]
Message-ID: <b4f17cbd-7471-fe61-6e7e-1399bd96e24e@amazon.de> (raw)
In-Reply-To: <20200526222402.GC179549@kroah.com>



On 27.05.20 00:24, Greg KH wrote:
> 
> On Tue, May 26, 2020 at 03:44:30PM +0200, Alexander Graf wrote:
>>
>>
>> On 26.05.20 15:17, Greg KH wrote:
>>>
>>> On Tue, May 26, 2020 at 02:44:18PM +0200, Alexander Graf wrote:
>>>>
>>>>
>>>> On 26.05.20 14:33, Greg KH wrote:
>>>>>
>>>>> On Tue, May 26, 2020 at 01:42:41PM +0200, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>> On 26.05.20 08:51, Greg KH wrote:
>>>>>>>
>>>>>>> On Tue, May 26, 2020 at 01:13:23AM +0300, Andra Paraschiv wrote:
>>>>>>>> +#define NE "nitro_enclaves: "
>>>>>>>
>>>>>>> Again, no need for this.
>>>>>>>
>>>>>>>> +#define NE_DEV_NAME "nitro_enclaves"
>>>>>>>
>>>>>>> KBUILD_MODNAME?
>>>>>>>
>>>>>>>> +#define NE_IMAGE_LOAD_OFFSET (8 * 1024UL * 1024UL)
>>>>>>>> +
>>>>>>>> +static char *ne_cpus;
>>>>>>>> +module_param(ne_cpus, charp, 0644);
>>>>>>>> +MODULE_PARM_DESC(ne_cpus, "<cpu-list> - CPU pool used for Nitro Enclaves");
>>>>>>>
>>>>>>> Again, please do not do this.
>>>>>>
>>>>>> I actually asked her to put this one in specifically.
>>>>>>
>>>>>> The concept of this parameter is very similar to isolcpus= and maxcpus= in
>>>>>> that it takes CPUs away from Linux and instead donates them to the
>>>>>> underlying hypervisor, so that it can spawn enclaves using them.
>>>>>>
>>>>>>    From an admin's point of view, this is a setting I would like to keep
>>>>>> persisted across reboots. How would this work with sysfs?
>>>>>
>>>>> How about just as the "initial" ioctl command to set things up?  Don't
>>>>> grab any cpu pools until asked to.  Otherwise, what happens when you
>>>>> load this module on a system that can't support it?
>>>>
>>>> That would give any user with access to the enclave device the ability to
>>>> remove CPUs from the system. That's clearly a CAP_ADMIN task in my book.
>>>
>>> Ok, what's wrong with that?
>>
>> Would you want random users to get the ability to hot unplug CPUs from your
>> system? At unlimited quantity? I don't :).
> 
> A random user, no, but one with admin rights, why not?  They can do that
> already today on your system, this isn't new.
> 
>>>> Hence this whole split: The admin defines the CPU Pool, users can safely
>>>> consume this pool to spawn enclaves from it.
>>>
>>> But having the admin define that at module load / boot time, is a major
>>> pain.  What tools do they have that allow them to do that easily?
>>
>> The normal toolbox: editing /etc/default/grub, adding an /etc/modprobe.d/
>> file.
> 
> Editing grub files is horrid, come on...

It's not editing grub files, it's editing template config files that 
then are used as input for grub config file generation :).

>> When but at module load / boot time would you define it? I really don't want
>> to have a device node that in theory "the world" can use which then allows
>> any user on the system to hot unplug every CPU but 0 from my system.
> 
> But you have that already when the PCI device is found, right?  What is
> the initial interface to the driver?  What's wrong with using that?
> 
> Or am I really missing something as to how this all fits together with
> the different pieces?  Seeing the patches as-is doesn't really provide a
> good overview, sorry.

Ok, let me walk you through the core donation process.

Imagine you have a parent VM with 8 cores. Every one of those virtual 
cores is 1:1 mapped to a physical core.

You enumerate the PCI device, you start working with it. None of that 
changes your topology.

You now create an enclave spanning 2 cores. The hypervisor will remove 
the 1:1 map for those 2 cores and instead mark them as "free floating" 
on the remaining 6 cores. It then uses the 2 freed up cores and creates 
a 1:1 map for the enclave's 2 cores

To ensure that we still see a realistic mapping of core topology, we 
need to remove those 2 cores from the parent VM's scope of execution. 
The way this is done today is by going through CPU offlining.

The first and obvious option would be to offline all respective CPUs 
when an enclave gets created. But unprivileged users should be able to 
spawn enclaves. So how do I ensure that my unprivileged user doesn't 
just offline all of my parent VM's CPUs?

The option implemented here is that we fold this into a two-stage 
approach. The admin reserves a "pool" of cores for enclaves to use. 
Unprivileged users can then consume cores from that pool, but not more 
than those.

That way, unprivileged users have no influence over whether a core is 
enabled or not. They can only consume the pool of cores that was 
dedicated for enclave use.

It also has the big advantage that you don't have dynamically changing 
CPU topology in your system. Long living processes that adjust their 
environment to the topology can still do so, without cores getting 
pulled out under their feet.

> 
>>>> So I really don't think an ioctl would be a great user experience. Same for
>>>> a sysfs file - although that's probably slightly better than the ioctl.
>>>
>>> You already are using ioctls to control this thing, right?  What's wrong
>>> with "one more"? :)
>>
>> So what we *could* do is add an ioctl to set the pool size which then does a
>> CAP_ADMIN check. That however means you now are in priority hell:
>>
>> A user that wants to spawn an enclave as part of an nginx service would need
>> to create another service to set the pool size and indicate the dependency
>> in systemd control files.
>>
>> Is that really better than a module parameter?
> 
> module parameters are hard to change, and manage control over who really
> is changing them.

What is hard about

$ echo 1-5 > /sys/module/nitro_enclaves/parameters/ne_cpus

I also fail to understand the argument about managing control over who 
is changing them. Only someone with CAP_ADMIN can change them, no? I 
feel like I'm missing something obvious in your argumentation :).


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




  reply	other threads:[~2020-05-28 13:01 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-25 22:13 [PATCH v3 00/18] Add support for Nitro Enclaves Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 01/18] nitro_enclaves: Add ioctl interface definition Andra Paraschiv
2020-05-27  8:49   ` Stefan Hajnoczi
2020-05-28 18:05     ` Paraschiv, Andra-Irina
2020-06-01  3:02     ` Benjamin Herrenschmidt
2020-06-01  7:20       ` Paraschiv, Andra-Irina
2020-06-05  8:15         ` Stefan Hajnoczi
2020-06-05 15:39           ` Paraschiv, Andra-Irina
2020-05-25 22:13 ` [PATCH v3 02/18] nitro_enclaves: Define the PCI device interface Andra Paraschiv
2020-05-26  6:44   ` Greg KH
2020-05-26 17:01     ` Paraschiv, Andra-Irina
2020-05-26 22:21       ` Greg KH
2020-05-28 16:37         ` Paraschiv, Andra-Irina
2020-06-01  2:59         ` Benjamin Herrenschmidt
2020-06-01  7:07           ` Paraschiv, Andra-Irina
2020-06-01  2:53       ` Benjamin Herrenschmidt
2020-05-25 22:13 ` [PATCH v3 03/18] nitro_enclaves: Define enclave info for internal bookkeeping Andra Paraschiv
2020-05-26  6:46   ` Greg KH
2020-05-26 17:20     ` Paraschiv, Andra-Irina
2020-05-25 22:13 ` [PATCH v3 04/18] nitro_enclaves: Init PCI device driver Andra Paraschiv
2020-05-26  6:48   ` Greg KH
2020-05-26 18:35     ` Paraschiv, Andra-Irina
2020-05-26 22:19       ` Greg KH
2020-05-28 16:26         ` Paraschiv, Andra-Irina
2020-06-01  2:55       ` Benjamin Herrenschmidt
2020-06-01  6:42         ` Paraschiv, Andra-Irina
2020-05-25 22:13 ` [PATCH v3 05/18] nitro_enclaves: Handle PCI device command requests Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 06/18] nitro_enclaves: Handle out-of-band PCI device events Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 07/18] nitro_enclaves: Init misc device providing the ioctl interface Andra Paraschiv
2020-05-26  6:51   ` Greg KH
2020-05-26 11:42     ` Alexander Graf
2020-05-26 12:33       ` Greg KH
2020-05-26 12:44         ` Alexander Graf
2020-05-26 13:17           ` Greg KH
2020-05-26 13:44             ` Alexander Graf
2020-05-26 22:24               ` Greg KH
2020-05-28 13:01                 ` Alexander Graf [this message]
2020-05-28 13:12                   ` Greg KH
2020-05-28 17:06                     ` Paraschiv, Andra-Irina
2020-06-01  3:04                     ` Benjamin Herrenschmidt
2020-06-09 10:47                       ` Alexander Graf
2020-06-01  3:01                 ` Benjamin Herrenschmidt
2020-06-01  2:51           ` Benjamin Herrenschmidt
2020-05-26 13:40         ` Paraschiv, Andra-Irina
2020-06-01  2:47     ` Benjamin Herrenschmidt
2020-06-01  5:48       ` Greg KH
2020-05-25 22:13 ` [PATCH v3 08/18] nitro_enclaves: Add logic for enclave vm creation Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 09/18] nitro_enclaves: Add logic for enclave vcpu creation Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 10/18] nitro_enclaves: Add logic for enclave image load metadata Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 11/18] nitro_enclaves: Add logic for enclave memory region set Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 12/18] nitro_enclaves: Add logic for enclave start Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 13/18] nitro_enclaves: Add logic for enclave termination Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 14/18] nitro_enclaves: Add Kconfig for the Nitro Enclaves driver Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 15/18] nitro_enclaves: Add Makefile " Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 16/18] nitro_enclaves: Add sample for ioctl interface usage Andra Paraschiv
2020-05-26  6:52   ` Greg KH
2020-05-25 22:13 ` [PATCH v3 17/18] nitro_enclaves: Add overview documentation Andra Paraschiv
2020-05-25 22:13 ` [PATCH v3 18/18] MAINTAINERS: Add entry for the Nitro Enclaves driver Andra Paraschiv

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b4f17cbd-7471-fe61-6e7e-1399bd96e24e@amazon.de \
    --to=graf@amazon.de \
    --cc=aliguori@amazon.com \
    --cc=andraprs@amazon.com \
    --cc=benh@kernel.crashing.org \
    --cc=colmmacc@amazon.com \
    --cc=doebel@amazon.de \
    --cc=dwmw@amazon.co.uk \
    --cc=fllinden@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpohlack@amazon.de \
    --cc=msw@amazon.com \
    --cc=ne-devel-upstream@amazon.com \
    --cc=pbonzini@redhat.com \
    --cc=sblbir@amazon.com \
    --cc=sgarzare@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=trawets@amazon.com \
    --cc=uwed@amazon.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).