linux-kernel.vger.kernel.org archive mirror
From: "Paraschiv, Andra-Irina" <andraprs@amazon.com>
To: Greg KH <gregkh@linuxfoundation.org>, Alexander Graf <graf@amazon.de>
Cc: <linux-kernel@vger.kernel.org>,
	Anthony Liguori <aliguori@amazon.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Colm MacCarthaigh <colmmacc@amazon.com>,
	Bjoern Doebel <doebel@amazon.de>,
	David Woodhouse <dwmw@amazon.co.uk>,
	Frank van der Linden <fllinden@amazon.com>,
	"Martin Pohlack" <mpohlack@amazon.de>,
	Matt Wilson <msw@amazon.com>, Paolo Bonzini <pbonzini@redhat.com>,
	Balbir Singh <sblbir@amazon.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Stewart Smith <trawets@amazon.com>,
	Uwe Dannowski <uwed@amazon.de>, <kvm@vger.kernel.org>,
	<ne-devel-upstream@amazon.com>
Subject: Re: [PATCH v3 07/18] nitro_enclaves: Init misc device providing the ioctl interface
Date: Thu, 28 May 2020 20:06:19 +0300
Message-ID: <46b2e9c5-6186-5c37-197c-8acd0bce358e@amazon.com>
In-Reply-To: <20200528131259.GA3345766@kroah.com>



On 28/05/2020 16:12, Greg KH wrote:
>
> On Thu, May 28, 2020 at 03:01:36PM +0200, Alexander Graf wrote:
>>
>> On 27.05.20 00:24, Greg KH wrote:
>>> On Tue, May 26, 2020 at 03:44:30PM +0200, Alexander Graf wrote:
>>>>
>>>> On 26.05.20 15:17, Greg KH wrote:
>>>>> On Tue, May 26, 2020 at 02:44:18PM +0200, Alexander Graf wrote:
>>>>>>
>>>>>> On 26.05.20 14:33, Greg KH wrote:
>>>>>>> On Tue, May 26, 2020 at 01:42:41PM +0200, Alexander Graf wrote:
>>>>>>>>
>>>>>>>> On 26.05.20 08:51, Greg KH wrote:
>>>>>>>>> On Tue, May 26, 2020 at 01:13:23AM +0300, Andra Paraschiv wrote:
>>>>>>>>>> +#define NE "nitro_enclaves: "
>>>>>>>>> Again, no need for this.
>>>>>>>>>
>>>>>>>>>> +#define NE_DEV_NAME "nitro_enclaves"
>>>>>>>>> KBUILD_MODNAME?
>>>>>>>>>
>>>>>>>>>> +#define NE_IMAGE_LOAD_OFFSET (8 * 1024UL * 1024UL)
>>>>>>>>>> +
>>>>>>>>>> +static char *ne_cpus;
>>>>>>>>>> +module_param(ne_cpus, charp, 0644);
>>>>>>>>>> +MODULE_PARM_DESC(ne_cpus, "<cpu-list> - CPU pool used for Nitro Enclaves");
>>>>>>>>> Again, please do not do this.
>>>>>>>> I actually asked her to put this one in specifically.
>>>>>>>>
>>>>>>>> The concept of this parameter is very similar to isolcpus= and maxcpus= in
>>>>>>>> that it takes CPUs away from Linux and instead donates them to the
>>>>>>>> underlying hypervisor, so that it can spawn enclaves using them.
>>>>>>>>
>>>>>>>> From an admin's point of view, this is a setting I would like to keep
>>>>>>>> persisted across reboots. How would this work with sysfs?
>>>>>>> How about just as the "initial" ioctl command to set things up?  Don't
>>>>>>> grab any cpu pools until asked to.  Otherwise, what happens when you
>>>>>>> load this module on a system that can't support it?
>>>>>> That would give any user with access to the enclave device the ability to
>>>>>> remove CPUs from the system. That's clearly a CAP_ADMIN task in my book.
>>>>> Ok, what's wrong with that?
>>>> Would you want random users to get the ability to hot unplug CPUs from your
>>>> system? At unlimited quantity? I don't :).
>>> A random user, no, but one with admin rights, why not?  They can do that
>>> already today on your system, this isn't new.
>>>
>>>>>> Hence this whole split: The admin defines the CPU Pool, users can safely
>>>>>> consume this pool to spawn enclaves from it.
>>>>> But having the admin define that at module load / boot time, is a major
>>>>> pain.  What tools do they have that allow them to do that easily?
>>>> The normal toolbox: editing /etc/default/grub, adding an /etc/modprobe.d/
>>>> file.
>>> Editing grub files is horrid, come on...
>> It's not editing grub files, it's editing template config files that then
>> are used as input for grub config file generation :).
>>
>>>> When but at module load / boot time would you define it? I really don't want
>>>> to have a device node that in theory "the world" can use which then allows
>>>> any user on the system to hot unplug every CPU but 0 from my system.
>>> But you have that already when the PCI device is found, right?  What is
>>> the initial interface to the driver?  What's wrong with using that?
>>>
>>> Or am I really missing something as to how this all fits together with
>>> the different pieces?  Seeing the patches as-is doesn't really provide a
>>> good overview, sorry.
>> Ok, let me walk you through the core donation process.
>>
>> Imagine you have a parent VM with 8 cores. Every one of those virtual cores
>> is 1:1 mapped to a physical core.
>>
>> You enumerate the PCI device, you start working with it. None of that
>> changes your topology.
>>
>> You now create an enclave spanning 2 cores. The hypervisor will remove the
>> 1:1 map for those 2 cores and instead mark them as "free floating" on the
>> remaining 6 cores. It then uses the 2 freed up cores and creates a 1:1 map
>> for the enclave's 2 cores.
>>
>> To ensure that we still see a realistic mapping of core topology, we need to
>> remove those 2 cores from the parent VM's scope of execution. The way this
>> is done today is by going through CPU offlining.
>>
>> The first and obvious option would be to offline all respective CPUs when an
>> enclave gets created. But unprivileged users should be able to spawn
>> enclaves. So how do I ensure that my unprivileged user doesn't just offline
>> all of my parent VM's CPUs?
>>
>> The option implemented here is that we fold this into a two-stage approach.
>> The admin reserves a "pool" of cores for enclaves to use. Unprivileged users
>> can then consume cores from that pool, but not more than those.
>>
>> That way, unprivileged users have no influence over whether a core is
>> enabled or not. They can only consume the pool of cores that was dedicated
>> for enclave use.
>>
>> It also has the big advantage that you don't have dynamically changing CPU
>> topology in your system. Long living processes that adjust their environment
>> to the topology can still do so, without cores getting pulled out under
>> their feet.
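
The two-stage approach described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the struct, the function names and the 64-CPU bitmask are hypothetical and are not taken from the patch series. It shows the admin reserving the pool, and unprivileged callers taking cores from that pool and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the driver. */
struct cpu_pool {
	uint64_t reserved; /* bit n set: CPU n was put in the pool by the admin */
	uint64_t in_use;   /* bit n set: CPU n is currently used by an enclave */
};

/* Stage 1 (admin only): define which CPUs enclaves may use. */
static bool pool_reserve(struct cpu_pool *pool, uint64_t mask, bool is_admin)
{
	if (!is_admin)
		return false;
	/* Refuse to drop a CPU from the pool while an enclave still uses it. */
	if (pool->in_use & ~mask)
		return false;
	pool->reserved = mask;
	return true;
}

/* Stage 2 (any user): take one CPU, but only out of the reserved pool. */
static bool pool_take(struct cpu_pool *pool, unsigned int cpu)
{
	uint64_t bit;

	if (cpu >= 64)
		return false;
	bit = 1ULL << cpu;
	if (!(pool->reserved & bit) || (pool->in_use & bit))
		return false;
	pool->in_use |= bit;
	return true;
}

/* Give a CPU back to the pool when the enclave goes away. */
static void pool_release(struct cpu_pool *pool, unsigned int cpu)
{
	if (cpu < 64)
		pool->in_use &= ~(1ULL << cpu);
}
```

In this model an unprivileged caller can never change `reserved`, so the set of CPUs that may disappear from the parent VM is fixed by the admin up front.
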
> Ok, that makes more sense, but:
>
>>>>>> So I really don't think an ioctl would be a great user experience. Same for
>>>>>> a sysfs file - although that's probably slightly better than the ioctl.
>>>>> You already are using ioctls to control this thing, right?  What's wrong
>>>>> with "one more"? :)
>>>> So what we *could* do is add an ioctl to set the pool size which then does a
>>>> CAP_ADMIN check. That however means you now are in priority hell:
>>>>
>>>> A user that wants to spawn an enclave as part of an nginx service would need
>>>> to create another service to set the pool size and indicate the dependency
>>>> in systemd control files.
>>>>
>>>> Is that really better than a module parameter?
>>> module parameters are hard to change, and it is hard to control who
>>> really is changing them.
>> What is hard about
>>
>> $ echo 1-5 > /sys/module/nitro_enclaves/parameters/ne_cpus
> So at runtime, after all is booted and up and going, you just ripped
> cores out from under someone's feet?  :)
>
> And the code really handles writing to that value while the module is
> already loaded and up and running?  At a quick glance, it didn't seem
> like it would handle that very well as it only is checked at ne_init()
> time.
>
> Or am I missing something?

True, for now it's only checked at module init.

I started with init; adapting the logic so that it can also handle the
setup via the module's sysfs file is still a TODO on my side.
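
As an illustration of what handling a runtime write such as "1-5" involves, here is a minimal userspace sketch, not the driver code. It assumes a 64-CPU bitmask, and the helper names `parse_cpu_list()` and `pool_set_from_string()` are hypothetical. In the kernel this would more likely be built on `module_param_cb()` with a custom set handler and `cpulist_parse()`. The point of the sketch is to validate the whole string first and only then commit the new pool.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CPUS 64 /* assumption: the pool fits in one 64-bit mask */

/*
 * Parse a CPU list such as "1-5" or "0,2,4-6" into a bitmask.
 * A single trailing newline (as written by echo) is accepted.
 * Returns 0 on success, -EINVAL otherwise; *mask is untouched on failure.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
	uint64_t result = 0;

	if (!s || !*s)
		return -EINVAL;

	while (*s) {
		char *end;
		unsigned long first, last, cpu;

		if (!isdigit((unsigned char)*s))
			return -EINVAL;
		first = strtoul(s, &end, 10);
		last = first;
		if (*end == '-') {
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -EINVAL;
			last = strtoul(s, &end, 10);
		}
		if (first > last || last >= MAX_CPUS)
			return -EINVAL;
		for (cpu = first; cpu <= last; cpu++)
			result |= 1ULL << cpu;

		s = end;
		if (*s == ',') {
			s++;
			if (!*s)
				return -EINVAL; /* trailing comma */
		} else if (*s == '\n' && s[1] == '\0') {
			break;
		} else if (*s != '\0') {
			return -EINVAL;
		}
	}

	*mask = result;
	return 0;
}

/* Replace the pool at runtime, unless a CPU still in use would be dropped. */
static int pool_set_from_string(uint64_t *pool, uint64_t in_use, const char *buf)
{
	uint64_t new_mask;
	int rc = parse_cpu_list(buf, &new_mask);

	if (rc)
		return rc;
	if (in_use & ~new_mask)
		return -EBUSY;
	*pool = new_mask;
	return 0;
}
```

Rejecting the write with `-EBUSY` while an enclave still uses a CPU is one possible policy; whether the real driver should refuse or tear things down is an open design choice.
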


Thanks,
Andra

>
> Anyway, yes, if you can dynamically do this at runtime, that's great,
> but it feels awkward to me to rely on one configuration thing as a
> module parameter, and everything else through the ioctl interface.
> Unification would seem to be a good thing, right?
>
> thanks,
>
> greg k-h

