* [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
@ 2021-09-13 13:11 Paolo Bonzini
  2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 13:11 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

Based on discussions from the previous week(end), this series implements
an ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc
file descriptor.  Other possibilities, such as closing and reopening
the device, are racy.

The patches are untested, but I am posting them because they are simple
and so that Yang Zhong can try using them in QEMU.
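As a rough sketch of the intended userspace usage (illustrative only, assuming
the series as posted; the helper name and error handling are hypothetical):

  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <asm/sgx.h>        /* SGX_IOC_VEPC_REMOVE, added by this series */

  /* Reset one vEPC region before rebooting the guest.  The ioctl takes no
   * argument and returns the number of pages that could not be EREMOVEd
   * (e.g. SECS pages that still had children on the first pass). */
  static int vepc_reset(int vepc_fd)
  {
          long failures = ioctl(vepc_fd, SGX_IOC_VEPC_REMOVE);

          if (failures < 0) {
                  perror("SGX_IOC_VEPC_REMOVE");
                  return -1;
          }
          if (failures)
                  /* Child pages are gone now, so SECS pages can be removed. */
                  failures = ioctl(vepc_fd, SGX_IOC_VEPC_REMOVE);

          return failures ? -1 : 0;
  }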

Paolo

Paolo Bonzini (2):
  x86: sgx_vepc: extract sgx_vepc_remove_page
  x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl

 arch/x86/include/uapi/asm/sgx.h |  2 ++
 arch/x86/kernel/cpu/sgx/virt.c  | 48 ++++++++++++++++++++++++++++++---
 2 files changed, 47 insertions(+), 3 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 13:11 [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Paolo Bonzini
@ 2021-09-13 13:11 ` Paolo Bonzini
  2021-09-13 14:05   ` Dave Hansen
  2021-09-13 20:33   ` Jarkko Sakkinen
  2021-09-13 13:11 ` [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl Paolo Bonzini
  2021-09-14  7:10 ` [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Yang Zhong
  2 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 13:11 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

Windows expects all pages to be in uninitialized state on startup.
In order to implement this, we will need an ioctl that performs
EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor:
other possibilities, such as closing and reopening the device,
are racy.

Start the implementation by pulling the EREMOVE into a separate
function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/cpu/sgx/virt.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 64511c4a5200..59b9c13121cd 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -111,7 +111,7 @@ static int sgx_vepc_mmap(struct file *file, struct vm_area_struct *vma)
 	return 0;
 }
 
-static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
+static int sgx_vepc_remove_page(struct sgx_epc_page *epc_page)
 {
 	int ret;
 
@@ -140,11 +140,17 @@ static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
 		 */
 		WARN_ONCE(ret != SGX_CHILD_PRESENT, EREMOVE_ERROR_MESSAGE,
 			  ret, ret);
-		return ret;
 	}
+	return ret;
+}
 
-	sgx_free_epc_page(epc_page);
+static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
+{
+	int ret = sgx_vepc_remove_page(epc_page);
+	if (ret)
+		return ret;
 
+	sgx_free_epc_page(epc_page);
 	return 0;
 }
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
  2021-09-13 13:11 [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Paolo Bonzini
  2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
@ 2021-09-13 13:11 ` Paolo Bonzini
  2021-09-13 19:33   ` Dave Hansen
  2021-09-14 10:55   ` Kai Huang
  2021-09-14  7:10 ` [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Yang Zhong
  2 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 13:11 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

Windows expects all pages to be in uninitialized state on startup.
Add an ioctl that does this with EREMOVE, so that userspace can bring
the pages back to this state when resetting the VM as well.
Pure userspace implementations, such as closing and reopening the device,
are racy.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/uapi/asm/sgx.h |  2 ++
 arch/x86/kernel/cpu/sgx/virt.c  | 36 +++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 9690d6899ad9..f79d84ce8033 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -27,6 +27,8 @@ enum sgx_page_flags {
 	_IOW(SGX_MAGIC, 0x02, struct sgx_enclave_init)
 #define SGX_IOC_ENCLAVE_PROVISION \
 	_IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
+#define SGX_IOC_VEPC_REMOVE \
+	_IO(SGX_MAGIC, 0x04)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 59b9c13121cd..81dc186fda2e 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -154,6 +154,24 @@ static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
 	return 0;
 }
 
+static long sgx_vepc_remove_all(struct sgx_vepc *vepc)
+{
+	struct sgx_epc_page *entry;
+	unsigned long index;
+	long failures = 0;
+
+	xa_for_each(&vepc->page_array, index, entry)
+		if (sgx_vepc_remove_page(entry))
+			failures++;
+
+	/*
+	 * Return the number of pages that failed to be removed, so
+	 * userspace knows that there are still SECS pages lying
+	 * around.
+	 */
+	return failures;
+}
+
 static int sgx_vepc_release(struct inode *inode, struct file *file)
 {
 	struct sgx_vepc *vepc = file->private_data;
@@ -239,9 +257,27 @@ static int sgx_vepc_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static long sgx_vepc_ioctl(struct file *file,
+			   unsigned int cmd, unsigned long arg)
+{
+	struct sgx_vepc *vepc = file->private_data;
+
+	switch (cmd) {
+	case SGX_IOC_VEPC_REMOVE:
+		if (arg)
+			return -EINVAL;
+		return sgx_vepc_remove_all(vepc);
+
+	default:
+		return -ENOTTY;
+	}
+}
+
 static const struct file_operations sgx_vepc_fops = {
 	.owner		= THIS_MODULE,
 	.open		= sgx_vepc_open,
+	.unlocked_ioctl	= sgx_vepc_ioctl,
+	.compat_ioctl	= sgx_vepc_ioctl,
 	.release	= sgx_vepc_release,
 	.mmap		= sgx_vepc_mmap,
 };
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
@ 2021-09-13 14:05   ` Dave Hansen
  2021-09-13 14:24     ` Paolo Bonzini
  2021-09-13 20:33   ` Jarkko Sakkinen
  1 sibling, 1 reply; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 14:05 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 9/13/21 6:11 AM, Paolo Bonzini wrote:
> Windows expects all pages to be in uninitialized state on startup.
> In order to implement this, we will need an ioctl that performs
> EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor:
> other possibilities, such as closing and reopening the device,
> are racy.

Hi Paolo,

How does this end up happening in the first place?

All enclave pages should start out on 'sgx_dirty_page_list' and
ksgxd sanitizes them with EREMOVE before making them available.  That
should cover EREMOVE after reboots while SGX pages are initialized,
including kexec().

sgx_vepc_free_page() should do the same for pages that a guest did not
clean up properly.

sgx_encl_free_epc_page() does an EREMOVE after a normal enclave has used
a page.

Those are the only three cases that I can think of.  So, it sounds like
one of those is buggy, or there's another unexpected path out there.
Ultimately, I think it would be really handy if we could do this EREMOVE
implicitly and without any new ABI.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 14:05   ` Dave Hansen
@ 2021-09-13 14:24     ` Paolo Bonzini
  2021-09-13 14:55       ` Dave Hansen
  2021-09-13 21:00       ` Jarkko Sakkinen
  0 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 14:24 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 13/09/21 16:05, Dave Hansen wrote:
> On 9/13/21 6:11 AM, Paolo Bonzini wrote:
>> Windows expects all pages to be in uninitialized state on startup.
>> In order to implement this, we will need an ioctl that performs
>> EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor:
>> other possibilities, such as closing and reopening the device,
>> are racy.
> 
> Hi Paolo,
> 
> How does this end up happening in the first place?
> 
> All enclave pages should start out on 'sgx_dirty_page_list' and
> ksgxd sanitizes them with EREMOVE before making them available.  That
> should cover EREMOVE after reboots while SGX pages are initialized,
> including kexec().

By "Windows startup" I mean even after guest reboot.  Because another 
process could sneak in and steal your EPC pages between a close() and an 
open(), I'd like to have a way to EREMOVE the pages while keeping them 
assigned to the specific vEPC instance, i.e. *without* going through 
sgx_vepc_free_page().

Thanks,

Paolo

> sgx_vepc_free_page() should do the same for pages that a guest did not
> clean up properly.
> 
> sgx_encl_free_epc_page() does an EREMOVE after a normal enclave has used
> a page.
> 
> Those are the only three cases that I can think of.  So, it sounds like
> one of those is buggy, or there's another unexpected path out there.
> Ultimately, I think it would be really handy if we could do this EREMOVE
> implicitly and without any new ABI.
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 14:24     ` Paolo Bonzini
@ 2021-09-13 14:55       ` Dave Hansen
  2021-09-13 15:14         ` Paolo Bonzini
  2021-09-13 21:12         ` Jarkko Sakkinen
  2021-09-13 21:00       ` Jarkko Sakkinen
  1 sibling, 2 replies; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 14:55 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 9/13/21 7:24 AM, Paolo Bonzini wrote:
>> How does this end up happening in the first place?
>>
>> All enclave pages should start out on 'sgx_dirty_page_list' and
>> ksgxd sanitizes them with EREMOVE before making them available.  That
>> should cover EREMOVE after reboots while SGX pages are initialized,
>> including kexec().
> 
> By "Windows startup" I mean even after guest reboot.  Because another
> process could sneak in and steal your EPC pages between a close() and an
> open(), I'd like to have a way to EREMOVE the pages while keeping them
> assigned to the specific vEPC instance, i.e. *without* going through
> sgx_vepc_free_page().

Oh, so you want fresh EPC state for the guest, but you're concerned that
the previous guest might have left them in a bad state.  The current
method of getting a new vepc instance (which guarantees fresh state) has
some other downsides.

Can't another process steal pages via sgxd and reclaim at any time?
What's the extra concern here about going through a close()/open()
cycle?  Performance?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 14:55       ` Dave Hansen
@ 2021-09-13 15:14         ` Paolo Bonzini
  2021-09-13 15:29           ` Dave Hansen
  2021-09-13 21:13           ` Jarkko Sakkinen
  2021-09-13 21:12         ` Jarkko Sakkinen
  1 sibling, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 15:14 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 13/09/21 16:55, Dave Hansen wrote:
>> By "Windows startup" I mean even after guest reboot.  Because another
>> process could sneak in and steal your EPC pages between a close() and an
>> open(), I'd like to have a way to EREMOVE the pages while keeping them
>> assigned to the specific vEPC instance, i.e.*without*  going through
>> sgx_vepc_free_page().
> Oh, so you want fresh EPC state for the guest, but you're concerned that
> the previous guest might have left them in a bad state.  The current
> method of getting a new vepc instance (which guarantees fresh state) has
> some other downsides.
> 
> Can't another process steal pages via sgxd and reclaim at any time?

vEPC pages never call sgx_mark_page_reclaimable, do they?

> What's the extra concern here about going through a close()/open()
> cycle?  Performance?

Apart from reclaiming, /dev/sgx_vepc might disappear between the first 
open() and subsequent ones.

Paolo


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 15:14         ` Paolo Bonzini
@ 2021-09-13 15:29           ` Dave Hansen
  2021-09-13 18:35             ` Paolo Bonzini
  2021-09-13 21:13           ` Jarkko Sakkinen
  1 sibling, 1 reply; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 15:29 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 9/13/21 8:14 AM, Paolo Bonzini wrote:
> On 13/09/21 16:55, Dave Hansen wrote:
>>> By "Windows startup" I mean even after guest reboot.  Because another
>>> process could sneak in and steal your EPC pages between a close() and an
>>> open(), I'd like to have a way to EREMOVE the pages while keeping them
>>> assigned to the specific vEPC instance, i.e.*without*  going through
>>> sgx_vepc_free_page().
>> Oh, so you want fresh EPC state for the guest, but you're concerned that
>> the previous guest might have left them in a bad state.  The current
>> method of getting a new vepc instance (which guarantees fresh state) has
>> some other downsides.
>>
>> Can't another process steal pages via sgxd and reclaim at any time?
> 
> vEPC pages never call sgx_mark_page_reclaimable, do they?

Oh, I just thought that they were on the SGX LRU.  You might be right.

But, we certainly don't want the fact that they are unreclaimable today
to be part of the ABI.  It's more of a bug than a feature.

>> What's the extra concern here about going through a close()/open()
>> cycle?  Performance?
> 
> Apart from reclaiming, /dev/sgx_vepc might disappear between the first
> open() and subsequent ones.

Aside from it being rm'd, I don't think that's possible.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 15:29           ` Dave Hansen
@ 2021-09-13 18:35             ` Paolo Bonzini
  2021-09-13 19:25               ` Dave Hansen
  2021-09-13 21:15               ` Jarkko Sakkinen
  0 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-13 18:35 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 13/09/21 17:29, Dave Hansen wrote:
> On 9/13/21 8:14 AM, Paolo Bonzini wrote:
>> On 13/09/21 16:55, Dave Hansen wrote:
>>>> By "Windows startup" I mean even after guest reboot.  Because another
>>>> process could sneak in and steal your EPC pages between a close() and an
>>>> open(), I'd like to have a way to EREMOVE the pages while keeping them
>>>> assigned to the specific vEPC instance, i.e.*without*  going through
>>>> sgx_vepc_free_page().
>>> Oh, so you want fresh EPC state for the guest, but you're concerned that
>>> the previous guest might have left them in a bad state.  The current
>>> method of getting a new vepc instance (which guarantees fresh state) has
>>> some other downsides.
>>>
>>> Can't another process steal pages via sgxd and reclaim at any time?
>>
>> vEPC pages never call sgx_mark_page_reclaimable, do they?
> 
> Oh, I just thought that they were on the SGX LRU.  You might be right.
> But, we certainly don't want the fact that they are unreclaimable today
> to be part of the ABI.  It's more of a bug than a feature.

Sure, that's fine.

>>> What's the extra concern here about going through a close()/open()
>>> cycle?  Performance?
>>
>> Apart from reclaiming, /dev/sgx_vepc might disappear between the first
>> open() and subsequent ones.
> 
> Aside from it being rm'd, I don't think that's possible.
> 

Being rm'd would be a possibility in principle, and it would be ugly for 
it to cause issues on running VMs.  Also I'd like to be able to 
pass /dev/sgx_vepc in via a file descriptor, and run QEMU in a chroot or 
a mount namespace.  Alternatively, with seccomp it may be possible to 
sandbox a running QEMU process in such a way that open() is forbidden at 
runtime (all hotplug is done via file descriptor passing); it is not yet 
possible, but it is a goal.
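
(For illustration only, not part of this series: the "file descriptor
passing" model mentioned above typically means a management process opens
/dev/sgx_vepc and hands the descriptor to the sandboxed QEMU over a UNIX
socket with SCM_RIGHTS; a minimal sending side could look like the sketch
below, with hypothetical names.)

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/uio.h>

  /* Hand an already-open vepc_fd to another process over a connected
   * AF_UNIX socket using SCM_RIGHTS ancillary data. */
  static int send_vepc_fd(int sock, int vepc_fd)
  {
          char dummy = 'v';
          struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
          char cbuf[CMSG_SPACE(sizeof(int))] = { 0 };
          struct msghdr msg = { 0 };
          struct cmsghdr *cmsg;

          msg.msg_iov = &iov;
          msg.msg_iovlen = 1;
          msg.msg_control = cbuf;
          msg.msg_controllen = sizeof(cbuf);

          cmsg = CMSG_FIRSTHDR(&msg);
          cmsg->cmsg_level = SOL_SOCKET;
          cmsg->cmsg_type = SCM_RIGHTS;
          cmsg->cmsg_len = CMSG_LEN(sizeof(int));
          memcpy(CMSG_DATA(cmsg), &vepc_fd, sizeof(int));

          return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
  }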

Paolo


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 18:35             ` Paolo Bonzini
@ 2021-09-13 19:25               ` Dave Hansen
  2021-09-13 21:16                 ` Jarkko Sakkinen
  2021-09-13 21:15               ` Jarkko Sakkinen
  1 sibling, 1 reply; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 19:25 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 9/13/21 11:35 AM, Paolo Bonzini wrote:
>>> Apart from reclaiming, /dev/sgx_vepc might disappear between the first
>>> open() and subsequent ones.
>>
>> Aside from it being rm'd, I don't think that's possible.
>>
> 
> Being rm'd would be a possibility in principle, and it would be ugly for
> it to cause issues on running VMs.  Also I'd like to be able to
> pass /dev/sgx_vepc in via a file descriptor, and run QEMU in a chroot or
> a mount namespace.  Alternatively, with seccomp it may be possible to
> sandbox a running QEMU process in such a way that open() is forbidden at
> runtime (all hotplug is done via file descriptor passing); it is not yet
> possible, but it is a goal.

OK, so maybe another way of saying this:

For bare-metal SGX on real hardware, the hardware provides guarantees about
SGX state at reboot.  For instance, all pages start out uninitialized.
The vepc driver provides a similar guarantee today for freshly-opened
vepc instances.

But, vepc users have a problem: they might want to run an OS that
expects to be booted with clean, fully uninitialized SGX state, just as
it would be on bare-metal.  Right now, the only way to get that is to
create a new vepc instance.  That might not be possible in all
configurations, like if the permission to open(/dev/sgx_vepc) has been
lost since the VM was first booted.

Windows has these expectations about booting with fully uninitialized
state.  There are also a number of environments where QEMU is sandboxed
or drops permissions in a way that prevents free and open access to
/dev/sgx_vepc.

So good so far?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
  2021-09-13 13:11 ` [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl Paolo Bonzini
@ 2021-09-13 19:33   ` Dave Hansen
  2021-09-13 21:11     ` Sean Christopherson
  2021-09-14 10:55   ` Kai Huang
  1 sibling, 1 reply; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 19:33 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On 9/13/21 6:11 AM, Paolo Bonzini wrote:
> +static long sgx_vepc_remove_all(struct sgx_vepc *vepc)
> +{
> +	struct sgx_epc_page *entry;
> +	unsigned long index;
> +	long failures = 0;
> +
> +	xa_for_each(&vepc->page_array, index, entry)
> +		if (sgx_vepc_remove_page(entry))
> +			failures++;
> +
> +	/*
> +	 * Return the number of pages that failed to be removed, so
> +	 * userspace knows that there are still SECS pages lying
> +	 * around.
> +	 */
> +	return failures;
> +}

I'm not sure the retry logic should be in userspace.  Also, is this
strictly limited to SECS pages?  It could also happen if there were
enclaves running that used the page.  Granted, userspace can probably
stop all the vcpus, but the *interface* doesn't prevent it being called
like that.

What else can userspace do but:

	ret = ioctl(fd, SGX_IOC_VEPC_REMOVE);
	if (ret)
		ret = ioctl(fd, SGX_IOC_VEPC_REMOVE);
	if (ret)
		printf("uh oh\n");

We already have existing code to gather up the pages that couldn't be
EREMOVE'd and selectively EREMOVE them again.  Why not reuse that code
here?  If there is 100GB of EPC, it's gotta be painful to run through
the ->page_array twice when once plus a small list iteration will do.

Which reminds me...  Do we need a cond_resched() in there or anything?
That loop could theoretically get really, really long.
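
For illustration, adding a cond_resched() to the removal loop from patch 2
might look like the sketch below (not something the posted series does yet):

	xa_for_each(&vepc->page_array, index, entry) {
		if (sgx_vepc_remove_page(entry))
			failures++;
		/* The xarray can cover a very large EPC; yield periodically. */
		cond_resched();
	}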

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
  2021-09-13 14:05   ` Dave Hansen
@ 2021-09-13 20:33   ` Jarkko Sakkinen
  1 sibling, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 20:33 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 09:11 -0400, Paolo Bonzini wrote:
> Windows expects all pages to be in uninitialized state on startup.
> In order to implement this, we will need an ioctl that performs
> EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor:
> other possibilities, such as closing and reopening the device,
> are racy.

So what makes it racy, and what do you mean by racy in this case?

I.e. you have to do open() and mmap(), and munmap() + close()
for removal.  Depending on the situation that is racy or not...

And is "Windows" more precisely a "Windows guest running in a
Linux QEMU host"?  It's ambiguous.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 14:24     ` Paolo Bonzini
  2021-09-13 14:55       ` Dave Hansen
@ 2021-09-13 21:00       ` Jarkko Sakkinen
  1 sibling, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 21:00 UTC (permalink / raw)
  To: Paolo Bonzini, Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 16:24 +0200, Paolo Bonzini wrote:
> On 13/09/21 16:05, Dave Hansen wrote:
> > On 9/13/21 6:11 AM, Paolo Bonzini wrote:
> > > Windows expects all pages to be in uninitialized state on startup.
> > > In order to implement this, we will need an ioctl that performs
> > > EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor:
> > > other possibilities, such as closing and reopening the device,
> > > are racy.
> > 
> > Hi Paolo,
> > 
> > How does this end up happening in the first place?
> > 
> > All enclave pages should start out on 'sgx_dirty_page_list' and
> > ksgxd sanitizes them with EREMOVE before making them available.  That
> > should cover EREMOVE after reboots while SGX pages are initialized,
> > including kexec().
> 
> By "Windows startup" I mean even after guest reboot.  Because another 
> process could sneak in and steal your EPC pages between a close() and an 
> open(), I'd like to have a way to EREMOVE the pages while keeping them 
> assigned to the specific vEPC instance, i.e. *without* going through 
> sgx_vepc_free_page().

Isn't "other process in and steal your EPC pages" more like sysadmin
problem, rather than software?

I'm lacking of understanding what would be the collateral damage in
the end.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
  2021-09-13 19:33   ` Dave Hansen
@ 2021-09-13 21:11     ` Sean Christopherson
  2021-09-13 22:43       ` Dave Hansen
  0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2021-09-13 21:11 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Paolo Bonzini, linux-kernel, kvm, x86, linux-sgx, jarkko,
	dave.hansen, yang.zhong

On Mon, Sep 13, 2021, Dave Hansen wrote:
> On 9/13/21 6:11 AM, Paolo Bonzini wrote:
> > +static long sgx_vepc_remove_all(struct sgx_vepc *vepc)
> > +{
> > +	struct sgx_epc_page *entry;
> > +	unsigned long index;
> > +	long failures = 0;
> > +
> > +	xa_for_each(&vepc->page_array, index, entry)
> > +		if (sgx_vepc_remove_page(entry))
> > +			failures++;
> > +
> > +	/*
> > +	 * Return the number of pages that failed to be removed, so
> > +	 * userspace knows that there are still SECS pages lying
> > +	 * around.
> > +	 */
> > +	return failures;
> > +}
> 
> I'm not sure the retry logic should be in userspace.  Also, is this
> strictly limited to SECS pages?  It could also happen if there were
> enclaves running that used the page.  Granted, userspace can probably
> stop all the vcpus, but the *interface* doesn't prevent it being called
> like that.

The general rule for KVM is that so long as userspace abuse of running vCPUs (or
other concurrent operations) doesn't put the kernel/platform at risk, it's
userspace's responsibility to not screw up.  The main argument being that there
are myriad ways the VMM can DoS the guest without having to abuse an ioctl().

> What else can userspace do but:
> 
> 	ret = ioctl(fd, SGX_IOC_VEPC_REMOVE);
> 	if (ret)
> 		ret = ioctl(fd, SGX_IOC_VEPC_REMOVE);
> 	if (ret)
> 		printf("uh oh\n");
> 
> We already have existing code to gather up the pages that couldn't be
> EREMOVE'd and selectively EREMOVE them again.  Why not reuse that code
> here?  If there is 100GB of EPC, it's gotta be painful to run through
> the ->page_array twice when once plus a small list iteration will do.

My argument against handling this fully in the kernel is that to handle a vNUMA
setup with multiple vEPC sections, the ioctl() would need to take a set of file
descriptors to handle the case where an SECS is pinned by a child page in a
different vEPC.  It would also require an extra list_head per page (or dynamic
allocations), as the list_head in sgx_epc_page will (likely, eventually) be needed
to handle EPC OOM.  In the teardown case, sgx_epc_page.list can be used because
the pages are taken off the active/allocated list as part of teardown.

Neither issue is the end of the world, but IMO it's not worth burning the memory
and taking on extra complexity in the kernel for a relatively rare operation that's
slow as dirt anyways.
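
(As an illustrative sketch of the userspace side described above — a
hypothetical helper, assuming one fd per vEPC section and that all vCPUs are
already stopped so the sweeps eventually converge:)

  #include <sys/ioctl.h>
  #include <asm/sgx.h>

  /* Reset every vEPC section of a guest.  An SECS page can only be EREMOVEd
   * once its child pages (possibly in a different section) are gone, so keep
   * sweeping all fds until no section reports failures. */
  static int vepc_reset_all(int *fds, int nr_fds)
  {
          long remaining;
          int i;

          do {
                  remaining = 0;
                  for (i = 0; i < nr_fds; i++) {
                          long r = ioctl(fds[i], SGX_IOC_VEPC_REMOVE);

                          if (r < 0)
                                  return -1;
                          remaining += r;
                  }
          } while (remaining);

          return 0;
  }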

> Which reminds me...  Do we need a cond_resched() in there or anything?
> That loop could theoretically get really, really long.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 14:55       ` Dave Hansen
  2021-09-13 15:14         ` Paolo Bonzini
@ 2021-09-13 21:12         ` Jarkko Sakkinen
  1 sibling, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 21:12 UTC (permalink / raw)
  To: Dave Hansen, Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 07:55 -0700, Dave Hansen wrote:
> On 9/13/21 7:24 AM, Paolo Bonzini wrote:
> > > How does this end up happening in the first place?
> > > 
> > > All enclave pages should start out on 'sgx_dirty_page_list' and
> > > ksgxd sanitizes them with EREMOVE before making them available.  That
> > > should cover EREMOVE after reboots while SGX pages are initialized,
> > > including kexec().
> > 
> > By "Windows startup" I mean even after guest reboot.  Because another
> > process could sneak in and steal your EPC pages between a close() and an
> > open(), I'd like to have a way to EREMOVE the pages while keeping them
> > assigned to the specific vEPC instance, i.e. *without* going through
> > sgx_vepc_free_page().
> 
> Oh, so you want fresh EPC state for the guest, but you're concerned that
> the previous guest might have left them in a bad state.  The current
> method of getting a new vepc instance (which guarantees fresh state) has
> some other downsides.
> 
> Can't another process steal pages via sgxd and reclaim at any time?
> What's the extra concern here about going through a close()/open()
> cycle?  Performance?

sgxd does not steal anything from vepc regions.

They are not part of the reclaiming process.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 15:14         ` Paolo Bonzini
  2021-09-13 15:29           ` Dave Hansen
@ 2021-09-13 21:13           ` Jarkko Sakkinen
  2021-09-14  5:36             ` Paolo Bonzini
  1 sibling, 1 reply; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 21:13 UTC (permalink / raw)
  To: Paolo Bonzini, Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 17:14 +0200, Paolo Bonzini wrote:
> On 13/09/21 16:55, Dave Hansen wrote:
> > > By "Windows startup" I mean even after guest reboot.  Because another
> > > process could sneak in and steal your EPC pages between a close() and an
> > > open(), I'd like to have a way to EREMOVE the pages while keeping them
> > > assigned to the specific vEPC instance, i.e.*without*  going through
> > > sgx_vepc_free_page().
> > Oh, so you want fresh EPC state for the guest, but you're concerned that
> > the previous guest might have left them in a bad state.  The current
> > method of getting a new vepc instance (which guarantees fresh state) has
> > some other downsides.
> > 
> > Can't another process steal pages via sgxd and reclaim at any time?
> 
> vEPC pages never call sgx_mark_page_reclaimable, do they?
> 
> > What's the extra concern here about going through a close()/open()
> > cycle?  Performance?
> 
> Apart from reclaiming, /dev/sgx_vepc might disappear between the first 
> open() and subsequent ones.

If /dev/sgx_vepc disappears, why is it a problem *for the software*, and
not a sysadmin problem?

I think that this is what the whole patch is lacking: why are we talking
about a software problem...

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 18:35             ` Paolo Bonzini
  2021-09-13 19:25               ` Dave Hansen
@ 2021-09-13 21:15               ` Jarkko Sakkinen
  1 sibling, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 21:15 UTC (permalink / raw)
  To: Paolo Bonzini, Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 20:35 +0200, Paolo Bonzini wrote:
> On 13/09/21 17:29, Dave Hansen wrote:
> > On 9/13/21 8:14 AM, Paolo Bonzini wrote:
> > > On 13/09/21 16:55, Dave Hansen wrote:
> > > > > By "Windows startup" I mean even after guest reboot.  Because another
> > > > > process could sneak in and steal your EPC pages between a close() and an
> > > > > open(), I'd like to have a way to EREMOVE the pages while keeping them
> > > > > assigned to the specific vEPC instance, i.e.*without*  going through
> > > > > sgx_vepc_free_page().
> > > > Oh, so you want fresh EPC state for the guest, but you're concerned that
> > > > the previous guest might have left them in a bad state.  The current
> > > > method of getting a new vepc instance (which guarantees fresh state) has
> > > > some other downsides.
> > > > 
> > > > Can't another process steal pages via sgxd and reclaim at any time?
> > > 
> > > vEPC pages never call sgx_mark_page_reclaimable, do they?
> > 
> > Oh, I just thought that they were on the SGX LRU.  You might be right.
> > But, we certainly don't want the fact that they are unreclaimable today
> > to be part of the ABI.  It's more of a bug than a feature.
> 
> Sure, that's fine.
> 
> > > > What's the extra concern here about going through a close()/open()
> > > > cycle?  Performance?
> > > 
> > > Apart from reclaiming, /dev/sgx_vepc might disappear between the first
> > > open() and subsequent ones.
> > 
> > Aside from it being rm'd, I don't think that's possible.
> > 
> 
> Being rm'd would be a possibility in principle, and it would be ugly for 
> it to cause issues on running VMs.  Also I'd like to be able to 
> pass /dev/sgx_vepc in via a file descriptor, and run QEMU in a chroot or 
> a mount namespace.  Alternatively, with seccomp it may be possible to 
> sandbox a running QEMU process in such a way that open() is forbidden at 
> runtime (all hotplug is done via file descriptor passing); it is not yet 
> possible, but it is a goal.

AFAIK, as long as you have open files for a device, they work
as expected.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 19:25               ` Dave Hansen
@ 2021-09-13 21:16                 ` Jarkko Sakkinen
  0 siblings, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-13 21:16 UTC (permalink / raw)
  To: Dave Hansen, Paolo Bonzini, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Mon, 2021-09-13 at 12:25 -0700, Dave Hansen wrote:
> On 9/13/21 11:35 AM, Paolo Bonzini wrote:
> > > > Apart from reclaiming, /dev/sgx_vepc might disappear between the first
> > > > open() and subsequent ones.
> > > 
> > > Aside from it being rm'd, I don't think that's possible.
> > > 
> > 
> > Being rm'd would be a possibility in principle, and it would be ugly for
> > it to cause issues on running VMs.  Also I'd like to be able to
> > pass /dev/sgx_vepc in via a file descriptor, and run QEMU in a chroot or
> > a mount namespace.  Alternatively, with seccomp it may be possible to
> > sandbox a running QEMU process in such a way that open() is forbidden at
> > runtime (all hotplug is done via file descriptor passing); it is not yet
> > possible, but it is a goal.
> 
> OK, so maybe another way of saying this:
> 
> For bare-metal SGX on real hardware, the hardware provides guarantees about
> SGX state at reboot.  For instance, all pages start out uninitialized.
> The vepc driver provides a similar guarantee today for freshly-opened
> vepc instances.
> 
> But, vepc users have a problem: they might want to run an OS that
> expects to be booted with clean, fully uninitialized SGX state, just as
> it would be on bare-metal.  Right now, the only way to get that is to
> create a new vepc instance.  That might not be possible in all
> configurations, like if the permission to open(/dev/sgx_vepc) has been
> lost since the VM was first booted.

So you maintain your systems in a way that this does not happen?

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
  2021-09-13 21:11     ` Sean Christopherson
@ 2021-09-13 22:43       ` Dave Hansen
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Hansen @ 2021-09-13 22:43 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, linux-kernel, kvm, x86, linux-sgx, jarkko,
	dave.hansen, yang.zhong

On 9/13/21 2:11 PM, Sean Christopherson wrote:
> My argument against handling this fully in the kernel is that to handle a vNUMA
> setup with multiple vEPC sections, the ioctl() would need to take a set of file
> descriptors to handle the case where an SECS is pinned by a child page in a
> different vEPC.

Bah, I'm always forgetting about the multiple vepc fd's case.

I completely agree that there's no sane way to do this with a per-vepc
ioctl() when the EREMOVE failures can originate from other vepc instances.

The only other possible thing would be to keep an mm_list for vepc
instances and have this ioctl() (or another interface) blast them all.
But that's going to be a heck of a lot more complicated than this is.

OK... you two are wearing me down on this one.

Let's just get this all documented in the changelogs, especially the
retry behavior.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-13 21:13           ` Jarkko Sakkinen
@ 2021-09-14  5:36             ` Paolo Bonzini
  2021-09-14 16:05               ` Jarkko Sakkinen
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-14  5:36 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On 13/09/21 23:13, Jarkko Sakkinen wrote:
>> Apart from reclaiming, /dev/sgx_vepc might disappear between the first
>> open() and subsequent ones.
>
> If /dev/sgx_vepc disappears, why is it a problem *for the software*, and
> not a sysadmin problem?

Rather than disappearing, it could be that a program first gets all the 
resources it needs before it gets malicious input, and then enters a 
restrictive sandbox.  In this case open() could be completely forbidden.

I will improve the documentation and changelogs when I post the non-RFC 
version; that could have been done better, sorry.

Paolo


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-13 13:11 [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Paolo Bonzini
  2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
  2021-09-13 13:11 ` [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl Paolo Bonzini
@ 2021-09-14  7:10 ` Yang Zhong
  2021-09-14 10:19   ` Paolo Bonzini
  2 siblings, 1 reply; 29+ messages in thread
From: Yang Zhong @ 2021-09-14  7:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On Mon, Sep 13, 2021 at 09:11:51AM -0400, Paolo Bonzini wrote:
> Based on discussions from the previous week(end), this series implements
> an ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc
> file descriptor.  Other possibilities, such as closing and reopening
> the device, are racy.
> 
> The patches are untested, but I am posting them because they are simple
> and so that Yang Zhong can try using them in QEMU.
> 

  Paolo, I re-implemented a reset patch on the QEMU side to call this ioctl(),
  and did some tests on Windows and Linux guests; the Windows/Linux guest reboots
  work well.

  So, is it time for me to send this reset patch to the QEMU community, or should
  I wait for this kernel patchset to be merged?  Thanks!
     
  Yang


> Paolo
> 
> Paolo Bonzini (2):
>   x86: sgx_vepc: extract sgx_vepc_remove_page
>   x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
> 
>  arch/x86/include/uapi/asm/sgx.h |  2 ++
>  arch/x86/kernel/cpu/sgx/virt.c  | 48 ++++++++++++++++++++++++++++++---
>  2 files changed, 47 insertions(+), 3 deletions(-)
> 
> -- 
> 2.27.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14  7:10 ` [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Yang Zhong
@ 2021-09-14 10:19   ` Paolo Bonzini
  2021-09-14 16:42     ` Jarkko Sakkinen
  2021-09-15  8:28     ` Yang Zhong
  0 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-14 10:19 UTC (permalink / raw)
  To: Yang Zhong; +Cc: linux-kernel, kvm, x86, linux-sgx, jarkko, dave.hansen

On 14/09/21 09:10, Yang Zhong wrote:
> On Mon, Sep 13, 2021 at 09:11:51AM -0400, Paolo Bonzini wrote:
>> Based on discussions from the previous week(end), this series implements
>> an ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc
>> file descriptor.  Other possibilities, such as closing and reopening
>> the device, are racy.
>>
>> The patches are untested, but I am posting them because they are simple
>> and so that Yang Zhong can try using them in QEMU.
>>
> 
>    Paolo, I re-implemented a reset patch on the QEMU side to call this ioctl(),
>    and did some tests on Windows and Linux guests; the Windows/Linux guest reboots
>    work well.
> 
>    So, is it time for me to send this reset patch to the QEMU community, or should
>    I wait for this kernel patchset to be merged?  Thanks!

Let's wait for this patch to be accepted first.  I'll wait a little more 
for Jarkko and Dave to comment on this, and include your "Tested-by".

I will also add cond_resched() on the final submission.

Paolo


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl
  2021-09-13 13:11 ` [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl Paolo Bonzini
  2021-09-13 19:33   ` Dave Hansen
@ 2021-09-14 10:55   ` Kai Huang
  1 sibling, 0 replies; 29+ messages in thread
From: Kai Huang @ 2021-09-14 10:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On Mon, 13 Sep 2021 09:11:53 -0400 Paolo Bonzini wrote:
> Windows expects all pages to be in uninitialized state on startup.
> Add an ioctl that does this with EREMOVE, so that userspace can bring
> the pages back to this state when resetting the VM as well.
> Pure userspace implementations, such as closing and reopening the device,
> are racy.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/uapi/asm/sgx.h |  2 ++
>  arch/x86/kernel/cpu/sgx/virt.c  | 36 +++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index 9690d6899ad9..f79d84ce8033 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -27,6 +27,8 @@ enum sgx_page_flags {
>  	_IOW(SGX_MAGIC, 0x02, struct sgx_enclave_init)
>  #define SGX_IOC_ENCLAVE_PROVISION \
>  	_IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> +#define SGX_IOC_VEPC_REMOVE \
> +	_IO(SGX_MAGIC, 0x04)

Perhaps SGX_IOC_VEPC_RESET is better than REMOVE, since this ioctl doesn't
actually remove any EPC page from the virtual EPC device, but just resets it to
a clean slate (by using EREMOVE).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page
  2021-09-14  5:36             ` Paolo Bonzini
@ 2021-09-14 16:05               ` Jarkko Sakkinen
  0 siblings, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-14 16:05 UTC (permalink / raw)
  To: Paolo Bonzini, Dave Hansen, linux-kernel, kvm
  Cc: x86, linux-sgx, dave.hansen, yang.zhong

On Tue, 2021-09-14 at 07:36 +0200, Paolo Bonzini wrote:
> On 13/09/21 23:13, Jarkko Sakkinen wrote:
> > > Apart from reclaiming, /dev/sgx_vepc might disappear between the first
> > > open() and subsequent ones.
> > 
> > If /dev/sgx_vepc disappears, why is it a problem *for the software*, and
> > not a sysadmin problem?
> 
> Rather than disappearing, it could be that a program first gets all the 
> resources it needs before it gets malicious input, and then enter a 
> restrictive sandbox.  In this case open() could be completely forbidden.
> 
> I will improve the documentation and changelogs when I post the non-RFC 
> version; that could have been done better, sorry.
> 

No worries, just trying to get to the bottom of the actual issue.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14 10:19   ` Paolo Bonzini
@ 2021-09-14 16:42     ` Jarkko Sakkinen
  2021-09-14 17:07       ` Paolo Bonzini
  2021-09-15  8:28     ` Yang Zhong
  1 sibling, 1 reply; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-14 16:42 UTC (permalink / raw)
  To: Paolo Bonzini, Yang Zhong; +Cc: linux-kernel, kvm, x86, linux-sgx, dave.hansen

On Tue, 2021-09-14 at 12:19 +0200, Paolo Bonzini wrote:
> On 14/09/21 09:10, Yang Zhong wrote:
> > On Mon, Sep 13, 2021 at 09:11:51AM -0400, Paolo Bonzini wrote:
> > > Based on discussions from the previous week(end), this series implements
> > > an ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc
> > > file descriptor.  Other possibilities, such as closing and reopening
> > > the device, are racy.
> > > 
> > > The patches are untested, but I am posting them because they are simple
> > > and so that Yang Zhong can try using them in QEMU.
> > > 
> > 
> >    Paolo, I re-implemented a reset patch on the QEMU side to call this ioctl(),
> >    and did some tests on Windows and Linux guests; the Windows/Linux guest reboots
> >    work well.
> > 
> >    So, is it time for me to send this reset patch to the QEMU community, or should
> >    I wait for this kernel patchset to be merged?  Thanks!
> 
> Let's wait for this patch to be accepted first.  I'll wait a little more 
> for Jarkko and Dave to comment on this, and include your "Tested-by".
> 
> I will also add cond_resched() on the final submission.

Why would these be conflicting tasks?  I.e. why could QEMU not use
what is available now and move forward to the better mechanisms when
they are available?

BTW, I do all my SGX testing in QEMU ATM (and have for some weeks).  IMHO, it's
already "good enough" for many tasks, even if this fallback case is
not perfectly sorted out.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14 16:42     ` Jarkko Sakkinen
@ 2021-09-14 17:07       ` Paolo Bonzini
  2021-09-14 17:40         ` Jarkko Sakkinen
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2021-09-14 17:07 UTC (permalink / raw)
  To: Jarkko Sakkinen, Yang Zhong
  Cc: linux-kernel, kvm, x86, linux-sgx, dave.hansen

On 14/09/21 18:42, Jarkko Sakkinen wrote:
>> Let's wait for this patch to be accepted first.  I'll wait a little more
>> for Jarkko and Dave to comment on this, and include your "Tested-by".
>>
>> I will also add cond_resched() on the final submission.
> Why would these be conflicting tasks?  I.e. why could QEMU not use
> what is available now and move forward to the better mechanisms when
> they are available?

The implementation using close/open is quite ugly (destroying and 
recreating the memory block breaks a few levels of abstractions), so 
it's not really something I'd like to commit.

Paolo


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14 17:07       ` Paolo Bonzini
@ 2021-09-14 17:40         ` Jarkko Sakkinen
  2021-09-14 17:44           ` Jarkko Sakkinen
  0 siblings, 1 reply; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-14 17:40 UTC (permalink / raw)
  To: Paolo Bonzini, Yang Zhong; +Cc: linux-kernel, kvm, x86, linux-sgx, dave.hansen

On Tue, 2021-09-14 at 19:07 +0200, Paolo Bonzini wrote:
> On 14/09/21 18:42, Jarkko Sakkinen wrote:
> > > Let's wait for this patch to be accepted first.  I'll wait a little more
> > > for Jarkko and Dave to comment on this, and include your "Tested-by".
> > > 
> > > I will also add cond_resched() on the final submission.
> > Why would these be conflicting tasks?  I.e. why could QEMU not use
> > what is available now and move forward to the better mechanisms when
> > they are available?
> 
> The implementation using close/open is quite ugly (destroying and 
> recreating the memory block breaks a few levels of abstractions), so 
> it's not really something I'd like to commit.

OK, so the driving reason for SGX_IOC_VEPC_RESET is the complex dance
with opening, closing and mmap()ing stuff, especially when dealing
with multiple sections for one VM?  OK, I think I can understand this,
given how notoriously hard it might be to get stable in user space.

Please just document this use case in some way (if I got it right) in
the commit message of the next version, and I think this starts to
make much more sense.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14 17:40         ` Jarkko Sakkinen
@ 2021-09-14 17:44           ` Jarkko Sakkinen
  0 siblings, 0 replies; 29+ messages in thread
From: Jarkko Sakkinen @ 2021-09-14 17:44 UTC (permalink / raw)
  To: Paolo Bonzini, Yang Zhong; +Cc: linux-kernel, kvm, x86, linux-sgx, dave.hansen

On Tue, 2021-09-14 at 20:40 +0300, Jarkko Sakkinen wrote:
> On Tue, 2021-09-14 at 19:07 +0200, Paolo Bonzini wrote:
> > On 14/09/21 18:42, Jarkko Sakkinen wrote:
> > > > Let's wait for this patch to be accepted first.  I'll wait a little more
> > > > for Jarkko and Dave to comment on this, and include your "Tested-by".
> > > > 
> > > > I will also add cond_resched() on the final submission.
> > > Why would these be conflicting tasks?  I.e. why could QEMU not use
> > > what is available now and move forward to the better mechanisms when
> > > they are available?
> > 
> > The implementation using close/open is quite ugly (destroying and 
> > recreating the memory block breaks a few levels of abstractions), so 
> > it's not really something I'd like to commit.
> 
> OK, so the driving reason for SGX_IOC_VEPC_RESET is the complex dance
> with opening, closing and mmapping() stuff, especially when dealing
> with multiple sections for one VM? OK, I think I can understand this,
> given how notorious it might be to get stable in the user space.
> 
> Please just document this use case some way (if I got it right) to
> the commit message of the next version, and I think this starts to
> make much more sense.

I would call it a bottleneck rather than a race, though.  I would keep
"race" for the things that can cause a real race condition inside the
kernel, corrupting data structures or whatever.

But for sure it is a bottleneck that can easily cause user space to
be racy without extraordinary carefulness.

/Jarkko

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages
  2021-09-14 10:19   ` Paolo Bonzini
  2021-09-14 16:42     ` Jarkko Sakkinen
@ 2021-09-15  8:28     ` Yang Zhong
  1 sibling, 0 replies; 29+ messages in thread
From: Yang Zhong @ 2021-09-15  8:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, x86, linux-sgx, jarkko, dave.hansen, yang.zhong

On Tue, Sep 14, 2021 at 12:19:31PM +0200, Paolo Bonzini wrote:
> On 14/09/21 09:10, Yang Zhong wrote:
> >On Mon, Sep 13, 2021 at 09:11:51AM -0400, Paolo Bonzini wrote:
> >>Based on discussions from the previous week(end), this series implements
> >>an ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc
> >>file descriptor.  Other possibilities, such as closing and reopening
> >>the device, are racy.
> >>
> >>The patches are untested, but I am posting them because they are simple
> >>and so that Yang Zhong can try using them in QEMU.
> >>
> >
> >   Paolo, I re-implemented a reset patch on the QEMU side to call this ioctl(),
> >   and did some tests on Windows and Linux guests; the Windows/Linux guest reboots
> >   work well.
> >
> >   So, is it time for me to send this reset patch to the QEMU community, or should
> >   I wait for this kernel patchset to be merged?  Thanks!
> 
> Let's wait for this patch to be accepted first.  I'll wait a little
> more for Jarkko and Dave to comment on this, and include your
> "Tested-by".
> 
> I will also add cond_resched() on the final submission.
> 

  Thanks Paolo, I will send the QEMU patch once this patchset is accepted.

  Today, I also did some corner-case tests and updated the related QEMU reset patch.

   do {
       ret = ioctl(fd, SGX_IOC_VEPC_REMOVE);
       /* this printf is only for debugging */
       printf("-------sgx ret=%d and n=%d---\n", ret, n++);
       if (ret)
           sleep(1);
   } while (ret);

  (1). With a vEPC size of 10M, start 4 enclaves (each ~2G in size) in the VM,
       then do 'system_reset' in the QEMU monitor tool.

  (2). With a vEPC size of 10G, start 500 enclaves (each ~20M in size) in the VM,
       then do 'system_reset' in the QEMU monitor tool.

  ret shows the number of failures (the number of SECS pages, 4 and 500) returned
  from the kernel side; after sleeping 1s, the ioctl returns 0 failures.

  If this reset is triggered by the guest BIOS, 0 SECS pages are returned from the
  kernel, which will not block VM booting.

  So, until now, the kernel patches work well.  If there is any new issue, I will
  report it to everyone.  Thanks!

  Yang

> Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread

Thread overview: 29+ messages
2021-09-13 13:11 [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Paolo Bonzini
2021-09-13 13:11 ` [PATCH 1/2] x86: sgx_vepc: extract sgx_vepc_remove_page Paolo Bonzini
2021-09-13 14:05   ` Dave Hansen
2021-09-13 14:24     ` Paolo Bonzini
2021-09-13 14:55       ` Dave Hansen
2021-09-13 15:14         ` Paolo Bonzini
2021-09-13 15:29           ` Dave Hansen
2021-09-13 18:35             ` Paolo Bonzini
2021-09-13 19:25               ` Dave Hansen
2021-09-13 21:16                 ` Jarkko Sakkinen
2021-09-13 21:15               ` Jarkko Sakkinen
2021-09-13 21:13           ` Jarkko Sakkinen
2021-09-14  5:36             ` Paolo Bonzini
2021-09-14 16:05               ` Jarkko Sakkinen
2021-09-13 21:12         ` Jarkko Sakkinen
2021-09-13 21:00       ` Jarkko Sakkinen
2021-09-13 20:33   ` Jarkko Sakkinen
2021-09-13 13:11 ` [PATCH 2/2] x86: sgx_vepc: implement SGX_IOC_VEPC_REMOVE ioctl Paolo Bonzini
2021-09-13 19:33   ` Dave Hansen
2021-09-13 21:11     ` Sean Christopherson
2021-09-13 22:43       ` Dave Hansen
2021-09-14 10:55   ` Kai Huang
2021-09-14  7:10 ` [RFC/RFT PATCH 0/2] x86: sgx_vepc: implement ioctl to EREMOVE all pages Yang Zhong
2021-09-14 10:19   ` Paolo Bonzini
2021-09-14 16:42     ` Jarkko Sakkinen
2021-09-14 17:07       ` Paolo Bonzini
2021-09-14 17:40         ` Jarkko Sakkinen
2021-09-14 17:44           ` Jarkko Sakkinen
2021-09-15  8:28     ` Yang Zhong
