[PATCH] x86/sgx: Free backing memory after faulting the enclave page

linux-sgx.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] x86/sgx: Free backing memory after faulting the enclave page
@ 2021-11-03 23:22 Jarkko Sakkinen
  2021-11-04 13:50 ` Dave Hansen
  2021-11-04 22:38 ` Dave Hansen
  0 siblings, 2 replies; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-03 23:22 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

The backing memory was not freed, after loading it back to the SGX
reserved memory. This problem was not prevalent with the systems that
were available at the time when SGX was first upstreamed, as the swapped
memory could grow only up to 128 MB.  However, Icelake Server can have
gigabytes of SGX memory, and thus this has become a real bottleneck.

Free the swap space for other use by calling shmem_truncate_range(),
when a page is faulted back.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/encl.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..f2d3f2e5028f 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -22,6 +22,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
 {
 	unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
 	struct sgx_encl *encl = encl_page->encl;
+	struct inode *inode = file_inode(encl->backing);
 	struct sgx_pageinfo pginfo;
 	struct sgx_backing b;
 	pgoff_t page_index;
@@ -60,6 +61,9 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
 
 	sgx_encl_put_backing(&b, false);
 
+	/* Free the backing memory. */
+	shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
+
 	return ret;
 }
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-03 23:22 [PATCH] x86/sgx: Free backing memory after faulting the enclave page Jarkko Sakkinen
@ 2021-11-04 13:50 ` Dave Hansen
  2021-11-04 15:04   ` Jarkko Sakkinen
  2021-11-04 22:38 ` Dave Hansen
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-04 13:50 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
> The backing memory was not freed, after loading it back to the SGX
> reserved memory. This problem was not prevalent with the systems that
> were available at the time when SGX was first upstreamed, as the swapped
> memory could grow only up to 128 MB.  However, Icelake Server can have
> gigabytes of SGX memory, and thus this has become a real bottleneck.
> 
> Free the swap space for other use by calling shmem_truncate_range(),
> when a page is faulted back.

This needs a _bit_ more context.  Could you include a few sentences
about what backing memory is?

It's also not a big deal but it is nice to include something like:

	This was found by inspection.

and:

	Reported-by: Dave Hansen <dave.hansen@linux.intel.com>

Also, what is the end-user-visible effect of this bug?  What would a
user see if they were impacted by this?  How did you determine that this
fixed the issue?  This memory will be recovered when the enclave is torn
down, right?

Do we also need to deal with truncating the PCMD?  (For those watching
along at home, there are two things SGX swaps to RAM: the actual page
data and also some metadata that ensures page integrity and helps
prevent things like rolling back to old versions of swapped pages)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 13:50 ` Dave Hansen
@ 2021-11-04 15:04   ` Jarkko Sakkinen
  2021-11-04 15:09     ` Jarkko Sakkinen
  2021-11-04 15:13     ` Dave Hansen
  0 siblings, 2 replies; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-04 15:04 UTC (permalink / raw)
  To: Dave Hansen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Thu, 2021-11-04 at 06:50 -0700, Dave Hansen wrote:
> On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
> > The backing memory was not freed, after loading it back to the SGX
> > reserved memory. This problem was not prevalent with the systems that
> > were available at the time when SGX was first upstreamed, as the swapped
> > memory could grow only up to 128 MB.  However, Icelake Server can have
> > gigabytes of SGX memory, and thus this has become a real bottleneck.
> > 
> > Free the swap space for other use by calling shmem_truncate_range(),
> > when a page is faulted back.
> 
> This needs a _bit_ more context.  Could you include a few sentences
> about what backing memory is?
> 
> It's also not a big deal but it is nice to include something like:
>         
>         This was found by inspection.
> 
> and:
> 
>         Reported-by: Dave Hansen <dave.hansen@linux.intel.com>

Oops, I'm sorry, I was planning to add this but forgot to do it.

> Also, what is the end-user-visible effect of this bug?  What would a
> user see if they were impacted by this?  How did you determine that this
> fixed the issue?  This memory will be recovered when the enclave is torn
> down, right?

You're absolutely right.

I just realized how to make the whole thing concrete and reproduce OOM with the
128 MB EPC that I have in my i5-9600KF desktop. I'll just reconfigure my test
VM to have sufficiently small amount of RAM, and come up with an appropriate
workload.

> Do we also need to deal with truncating the PCMD?  (For those watching
> along at home, there are two things SGX swaps to RAM: the actual page
> data and also some metadata that ensures page integrity and helps
> prevent things like rolling back to old versions of swapped pages)

Yes.

This can be achieved by iterating through all of the enclave pages,
which share the same shmem page for storing their PCMD's, as the one
being faulted back. If none of those pages is swapped, the PCMD page can
safely truncated.

I guess it makes sense to put this into patch of its own.

/Jarkko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 15:04   ` Jarkko Sakkinen
@ 2021-11-04 15:09     ` Jarkko Sakkinen
  2021-11-04 15:13     ` Dave Hansen
  1 sibling, 0 replies; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-04 15:09 UTC (permalink / raw)
  To: Dave Hansen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Thu, 2021-11-04 at 17:04 +0200, Jarkko Sakkinen wrote:
> This can be achieved by iterating through all of the enclave pages,
> which share the same shmem page for storing their PCMD's, as the one
> being faulted back. If none of those pages is swapped, the PCMD page can
> safely truncated.

We have bookkeeping in place for this: encl->page_array.

/Jarkko


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 15:04   ` Jarkko Sakkinen
  2021-11-04 15:09     ` Jarkko Sakkinen
@ 2021-11-04 15:13     ` Dave Hansen
  2021-11-04 15:25       ` Jarkko Sakkinen
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-04 15:13 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/4/21 8:04 AM, Jarkko Sakkinen wrote:
>> Do we also need to deal with truncating the PCMD?  (For those watching
>> along at home, there are two things SGX swaps to RAM: the actual page
>> data and also some metadata that ensures page integrity and helps
>> prevent things like rolling back to old versions of swapped pages)
> Yes.
> 
> This can be achieved by iterating through all of the enclave pages,
> which share the same shmem page for storing their PCMD's, as the one
> being faulted back. If none of those pages is swapped, the PCMD page can
> safely truncated.

I was thinking we could just read the page.  If it's all 0's, truncate it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 15:13     ` Dave Hansen
@ 2021-11-04 15:25       ` Jarkko Sakkinen
  2021-11-04 15:29         ` Dave Hansen
  0 siblings, 1 reply; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-04 15:25 UTC (permalink / raw)
  To: Dave Hansen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Thu, 2021-11-04 at 08:13 -0700, Dave Hansen wrote:
> On 11/4/21 8:04 AM, Jarkko Sakkinen wrote:
> > > Do we also need to deal with truncating the PCMD?  (For those watching
> > > along at home, there are two things SGX swaps to RAM: the actual page
> > > data and also some metadata that ensures page integrity and helps
> > > prevent things like rolling back to old versions of swapped pages)
> > Yes.
> > 
> > This can be achieved by iterating through all of the enclave pages,
> > which share the same shmem page for storing their PCMD's, as the one
> > being faulted back. If none of those pages is swapped, the PCMD page can
> > safely truncated.
> 
> I was thinking we could just read the page.  If it's all 0's, truncate it.

Hmm... did ELDU zero PCMD as a side-effect?

It should be fairly effecient just to check the pages by using
encl->page_tree.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 15:25       ` Jarkko Sakkinen
@ 2021-11-04 15:29         ` Dave Hansen
  2021-11-07 19:42           ` Jarkko Sakkinen
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-04 15:29 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/4/21 8:25 AM, Jarkko Sakkinen wrote:
> On Thu, 2021-11-04 at 08:13 -0700, Dave Hansen wrote:
>> On 11/4/21 8:04 AM, Jarkko Sakkinen wrote:
>>>> Do we also need to deal with truncating the PCMD?  (For those watching
>>>> along at home, there are two things SGX swaps to RAM: the actual page
>>>> data and also some metadata that ensures page integrity and helps
>>>> prevent things like rolling back to old versions of swapped pages)
>>> Yes.
>>>
>>> This can be achieved by iterating through all of the enclave pages,
>>> which share the same shmem page for storing their PCMD's, as the one
>>> being faulted back. If none of those pages is swapped, the PCMD page can
>>> safely truncated.
>> I was thinking we could just read the page.  If it's all 0's, truncate it.
> Hmm... did ELDU zero PCMD as a side-effect?

I don't think so, but there's nothing stopping us from doing it ourselves.

> It should be fairly effecient just to check the pages by using
> encl->page_tree.

That sounds more complicated and slower than what I suggested.  You
could even just check the refcount on the page.  I _think_ page cache
pages have a refcount of 2.  So, look for the refcount that means "no
more PCMD in this page", and just free it if so.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-03 23:22 [PATCH] x86/sgx: Free backing memory after faulting the enclave page Jarkko Sakkinen
  2021-11-04 13:50 ` Dave Hansen
@ 2021-11-04 22:38 ` Dave Hansen
  2021-11-07 18:06   ` Jarkko Sakkinen
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-04 22:38 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Jethro Beekman,
	Sean Christopherson
  Cc: reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -22,6 +22,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  {
>  	unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
>  	struct sgx_encl *encl = encl_page->encl;
> +	struct inode *inode = file_inode(encl->backing);
>  	struct sgx_pageinfo pginfo;
>  	struct sgx_backing b;
>  	pgoff_t page_index;
> @@ -60,6 +61,9 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  
>  	sgx_encl_put_backing(&b, false);
>  
> +	/* Free the backing memory. */
> +	shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
> +
>  	return ret;
>  }

This also misses tearing down the backing storage if it is in place at
sgx_encl_release().

Does a entry->epc_page==NULL page in there guarantee that it has backing
storage?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 22:38 ` Dave Hansen
@ 2021-11-07 18:06   ` Jarkko Sakkinen
  2021-11-07 19:06     ` Dave Hansen
  0 siblings, 1 reply; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-07 18:06 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Thu, Nov 04, 2021 at 03:38:55PM -0700, Dave Hansen wrote:
> On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -22,6 +22,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> >  {
> >  	unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
> >  	struct sgx_encl *encl = encl_page->encl;
> > +	struct inode *inode = file_inode(encl->backing);
> >  	struct sgx_pageinfo pginfo;
> >  	struct sgx_backing b;
> >  	pgoff_t page_index;
> > @@ -60,6 +61,9 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> >  
> >  	sgx_encl_put_backing(&b, false);
> >  
> > +	/* Free the backing memory. */
> > +	shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
> > +
> >  	return ret;
> >  }
> 
> This also misses tearing down the backing storage if it is in place at
> sgx_encl_release().

Hmm... sgx_encl_release() does fput(). Isn't that enough to tear it down,
or does it require explicit truncate, i.e. something like

        shmem_truncate_range(file_inode(encl->backing), encl->base, encl->size - 1);


> Does a entry->epc_page==NULL page in there guarantee that it has backing
> storage?

Yes, it is an invariant. That what I was thinking to use for PCMD: iterate
32 pages and check if they have a faulted page.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 18:06   ` Jarkko Sakkinen
@ 2021-11-07 19:06     ` Dave Hansen
  2021-11-07 19:45       ` Jarkko Sakkinen
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-07 19:06 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/7/21 10:06 AM, Jarkko Sakkinen wrote:
> On Thu, Nov 04, 2021 at 03:38:55PM -0700, Dave Hansen wrote:
>> On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
>>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>>> @@ -22,6 +22,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>>>  {
>>>  	unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
>>>  	struct sgx_encl *encl = encl_page->encl;
>>> +	struct inode *inode = file_inode(encl->backing);
>>>  	struct sgx_pageinfo pginfo;
>>>  	struct sgx_backing b;
>>>  	pgoff_t page_index;
>>> @@ -60,6 +61,9 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>>>  
>>>  	sgx_encl_put_backing(&b, false);
>>>  
>>> +	/* Free the backing memory. */
>>> +	shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
>>> +
>>>  	return ret;
>>>  }
>>
>> This also misses tearing down the backing storage if it is in place at
>> sgx_encl_release().
> 
> Hmm... sgx_encl_release() does fput(). Isn't that enough to tear it down,
> or does it require explicit truncate, i.e. something like
> 
>         shmem_truncate_range(file_inode(encl->backing), encl->base, encl->size - 1);

That's true, the page cache should all be torn down along with the
fput().  *But*, it would be a very nice property if the backing storage
was empty by this point.  It essentially ensures that no enclave-runtime
cases missed truncating the backing storage away.

>> Does a entry->epc_page==NULL page in there guarantee that it has backing
>> storage?
> 
> Yes, it is an invariant. That what I was thinking to use for PCMD: iterate
> 32 pages and check if they have a faulted page.

I think the rule should be that entry->epc_page==NULL enclave pages have
backing storage.  All entry->epc_page!=NULL do *not* have backing storage.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-04 15:29         ` Dave Hansen
@ 2021-11-07 19:42           ` Jarkko Sakkinen
  2021-11-07 19:51             ` Dave Hansen
  0 siblings, 1 reply; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-07 19:42 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Thu, Nov 04, 2021 at 08:29:49AM -0700, Dave Hansen wrote:
> On 11/4/21 8:25 AM, Jarkko Sakkinen wrote:
> > On Thu, 2021-11-04 at 08:13 -0700, Dave Hansen wrote:
> >> On 11/4/21 8:04 AM, Jarkko Sakkinen wrote:
> >>>> Do we also need to deal with truncating the PCMD?  (For those watching
> >>>> along at home, there are two things SGX swaps to RAM: the actual page
> >>>> data and also some metadata that ensures page integrity and helps
> >>>> prevent things like rolling back to old versions of swapped pages)
> >>> Yes.
> >>>
> >>> This can be achieved by iterating through all of the enclave pages,
> >>> which share the same shmem page for storing their PCMD's, as the one
> >>> being faulted back. If none of those pages is swapped, the PCMD page can
> >>> safely truncated.
> >> I was thinking we could just read the page.  If it's all 0's, truncate it.
> > Hmm... did ELDU zero PCMD as a side-effect?
> 
> I don't think so, but there's nothing stopping us from doing it ourselves.

Ok.

> > It should be fairly effecient just to check the pages by using
> > encl->page_tree.
> 
> That sounds more complicated and slower than what I suggested.  You
> could even just check the refcount on the page.  I _think_ page cache
> pages have a refcount of 2.  So, look for the refcount that means "no
> more PCMD in this page", and just free it if so.

Umh, so... there is total 32 PCMD's per one page.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 19:06     ` Dave Hansen
@ 2021-11-07 19:45       ` Jarkko Sakkinen
  2021-11-07 19:58         ` Dave Hansen
  0 siblings, 1 reply; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-07 19:45 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Sun, Nov 07, 2021 at 11:06:01AM -0800, Dave Hansen wrote:
> On 11/7/21 10:06 AM, Jarkko Sakkinen wrote:
> > On Thu, Nov 04, 2021 at 03:38:55PM -0700, Dave Hansen wrote:
> >> On 11/3/21 4:22 PM, Jarkko Sakkinen wrote:
> >>> --- a/arch/x86/kernel/cpu/sgx/encl.c
> >>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> >>> @@ -22,6 +22,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> >>>  {
> >>>  	unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
> >>>  	struct sgx_encl *encl = encl_page->encl;
> >>> +	struct inode *inode = file_inode(encl->backing);
> >>>  	struct sgx_pageinfo pginfo;
> >>>  	struct sgx_backing b;
> >>>  	pgoff_t page_index;
> >>> @@ -60,6 +61,9 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> >>>  
> >>>  	sgx_encl_put_backing(&b, false);
> >>>  
> >>> +	/* Free the backing memory. */
> >>> +	shmem_truncate_range(inode, PFN_PHYS(page_index), PFN_PHYS(page_index) + PAGE_SIZE - 1);
> >>> +
> >>>  	return ret;
> >>>  }
> >>
> >> This also misses tearing down the backing storage if it is in place at
> >> sgx_encl_release().
> > 
> > Hmm... sgx_encl_release() does fput(). Isn't that enough to tear it down,
> > or does it require explicit truncate, i.e. something like
> > 
> >         shmem_truncate_range(file_inode(encl->backing), encl->base, encl->size - 1);
> 
> That's true, the page cache should all be torn down along with the
> fput().  *But*, it would be a very nice property if the backing storage
> was empty by this point.  It essentially ensures that no enclave-runtime
> cases missed truncating the backing storage away.

What if an enclave is released a point when all of its pages
are swapped out? Or even simpler case would an enclave that is
larger than all of EPC.

What can be made sure is that for all pages, which are in EPC,
the backing page is truncated.

> >> Does a entry->epc_page==NULL page in there guarantee that it has backing
> >> storage?
> > 
> > Yes, it is an invariant. That what I was thinking to use for PCMD: iterate
> > 32 pages and check if they have a faulted page.
> 
> I think the rule should be that entry->epc_page==NULL enclave pages have
> backing storage.  All entry->epc_page!=NULL do *not* have backing storage.

Yes, that is the goal of this patch.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 19:42           ` Jarkko Sakkinen
@ 2021-11-07 19:51             ` Dave Hansen
  2021-11-07 22:28               ` Jarkko Sakkinen
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-07 19:51 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/7/21 11:42 AM, Jarkko Sakkinen wrote:
>>> It should be fairly effecient just to check the pages by using
>>> encl->page_tree.
>> That sounds more complicated and slower than what I suggested.  You
>> could even just check the refcount on the page.  I _think_ page cache
>> pages have a refcount of 2.  So, look for the refcount that means "no
>> more PCMD in this page", and just free it if so.
> Umh, so... there is total 32 PCMD's per one page.

When you place PCMD in a page, you do a get_page().  The refcount goes
up by one.  So, a PCMD page with one PCMD will (I think) have a refcount
of 3.  If you totally fill it up with 31 *more* PCMD entries, it will
have a refcount of 34.  You do *not* do a put_page() on the PCMD page at
the end of the allocation phase.

When the backing storage is freed, you drop the refcount.  So, going
from 32 PCMD entries to 31 entries in a page, you go from 34->33.

When that refcount drops to 2, you know there is no more data in the
page that's useful.  At that point you can truncate it out of the
backing storage.

There's no reason to scan the page, or a boatload of other metadata.
Just keep a refcount.  Just use the *existing* 'struct page' refcount.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 19:45       ` Jarkko Sakkinen
@ 2021-11-07 19:58         ` Dave Hansen
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Hansen @ 2021-11-07 19:58 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/7/21 11:45 AM, Jarkko Sakkinen wrote:
> On Sun, Nov 07, 2021 at 11:06:01AM -0800, Dave Hansen wrote:
>> That's true, the page cache should all be torn down along with the
>> fput().  *But*, it would be a very nice property if the backing storage
>> was empty by this point.  It essentially ensures that no enclave-runtime
>> cases missed truncating the backing storage away.
> 
> What if an enclave is released a point when all of its pages
> are swapped out? Or even simpler case would an enclave that is
> larger than all of EPC.

In this loop:

void sgx_encl_release(struct kref *ref)
{
	...
        xa_for_each(&encl->page_array, index, entry) {
		if (entry->epc_page == NULL)
			// truncate backing storage

> What can be made sure is that for all pages, which are in EPC,
> the backing page is truncated.

Right, and that should be utterly trivial to do.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 19:51             ` Dave Hansen
@ 2021-11-07 22:28               ` Jarkko Sakkinen
  2021-11-08  6:34                 ` Dave Hansen
  0 siblings, 1 reply; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-07 22:28 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Sun, 2021-11-07 at 11:51 -0800, Dave Hansen wrote:
> On 11/7/21 11:42 AM, Jarkko Sakkinen wrote:
> > > > It should be fairly effecient just to check the pages by using
> > > > encl->page_tree.
> > > That sounds more complicated and slower than what I suggested.  You
> > > could even just check the refcount on the page.  I _think_ page cache
> > > pages have a refcount of 2.  So, look for the refcount that means "no
> > > more PCMD in this page", and just free it if so.
> > Umh, so... there is total 32 PCMD's per one page.
> 
> When you place PCMD in a page, you do a get_page().  The refcount goes
> up by one.  So, a PCMD page with one PCMD will (I think) have a refcount
> of 3.  If you totally fill it up with 31 *more* PCMD entries, it will
> have a refcount of 34.  You do *not* do a put_page() on the PCMD page at
> the end of the allocation phase.
> 
> When the backing storage is freed, you drop the refcount.  So, going
> from 32 PCMD entries to 31 entries in a page, you go from 34->33.
> 
> When that refcount drops to 2, you know there is no more data in the
> page that's useful.  At that point you can truncate it out of the
> backing storage.
> 
> There's no reason to scan the page, or a boatload of other metadata.
> Just keep a refcount.  Just use the *existing* 'struct page' refcount.

Right! Thank you, I'll use this approach, and check that the refcount
actually behaves that way you described.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-07 22:28               ` Jarkko Sakkinen
@ 2021-11-08  6:34                 ` Dave Hansen
  2021-11-08 20:07                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2021-11-08  6:34 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On 11/7/21 2:28 PM, Jarkko Sakkinen wrote:
>> When you place PCMD in a page, you do a get_page().  The refcount goes
>> up by one.  So, a PCMD page with one PCMD will (I think) have a refcount
>> of 3.  If you totally fill it up with 31 *more* PCMD entries, it will
>> have a refcount of 34.  You do *not* do a put_page() on the PCMD page at
>> the end of the allocation phase.
>>
>> When the backing storage is freed, you drop the refcount.  So, going
>> from 32 PCMD entries to 31 entries in a page, you go from 34->33.
>>
>> When that refcount drops to 2, you know there is no more data in the
>> page that's useful.  At that point you can truncate it out of the
>> backing storage.
>>
>> There's no reason to scan the page, or a boatload of other metadata.
>> Just keep a refcount.  Just use the *existing* 'struct page' refcount.
> Right! Thank you, I'll use this approach, and check that the refcount
> actually behaves that way you described.

Thinking about this a bit more...  We don't want to use the normal
get/put_page() refcount for this.  If we do, it will pin the page while
there is any data in it, preventing it from being reclaimed (swapped).

That isn't to say that we can't keep *a* refcount, just that we can't
use the page refcount for this.

I still like the idea of just scanning the whole page for zeros.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/sgx: Free backing memory after faulting the enclave page
  2021-11-08  6:34                 ` Dave Hansen
@ 2021-11-08 20:07                   ` Jarkko Sakkinen
  0 siblings, 0 replies; 17+ messages in thread
From: Jarkko Sakkinen @ 2021-11-08 20:07 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Jethro Beekman, Sean Christopherson,
	reinette.chatre, tony.luck, nathaniel, stable, Borislav Petkov,
	linux-sgx, linux-kernel

On Sun, Nov 07, 2021 at 10:34:02PM -0800, Dave Hansen wrote:
> On 11/7/21 2:28 PM, Jarkko Sakkinen wrote:
> >> When you place PCMD in a page, you do a get_page().  The refcount goes
> >> up by one.  So, a PCMD page with one PCMD will (I think) have a refcount
> >> of 3.  If you totally fill it up with 31 *more* PCMD entries, it will
> >> have a refcount of 34.  You do *not* do a put_page() on the PCMD page at
> >> the end of the allocation phase.
> >>
> >> When the backing storage is freed, you drop the refcount.  So, going
> >> from 32 PCMD entries to 31 entries in a page, you go from 34->33.
> >>
> >> When that refcount drops to 2, you know there is no more data in the
> >> page that's useful.  At that point you can truncate it out of the
> >> backing storage.
> >>
> >> There's no reason to scan the page, or a boatload of other metadata.
> >> Just keep a refcount.  Just use the *existing* 'struct page' refcount.
> > Right! Thank you, I'll use this approach, and check that the refcount
> > actually behaves that way you described.
> 
> Thinking about this a bit more...  We don't want to use the normal
> get/put_page() refcount for this.  If we do, it will pin the page while
> there is any data in it, preventing it from being reclaimed (swapped).
> 
> That isn't to say that we can't keep *a* refcount, just that we can't
> use the page refcount for this.
> 
> I still like the idea of just scanning the whole page for zeros.

I can try that route first. I like the property in zeroing that it has
predicatable O(1) cost.

/Jarkko

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-11-08 20:07 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-03 23:22 [PATCH] x86/sgx: Free backing memory after faulting the enclave page Jarkko Sakkinen
2021-11-04 13:50 ` Dave Hansen
2021-11-04 15:04   ` Jarkko Sakkinen
2021-11-04 15:09     ` Jarkko Sakkinen
2021-11-04 15:13     ` Dave Hansen
2021-11-04 15:25       ` Jarkko Sakkinen
2021-11-04 15:29         ` Dave Hansen
2021-11-07 19:42           ` Jarkko Sakkinen
2021-11-07 19:51             ` Dave Hansen
2021-11-07 22:28               ` Jarkko Sakkinen
2021-11-08  6:34                 ` Dave Hansen
2021-11-08 20:07                   ` Jarkko Sakkinen
2021-11-04 22:38 ` Dave Hansen
2021-11-07 18:06   ` Jarkko Sakkinen
2021-11-07 19:06     ` Dave Hansen
2021-11-07 19:45       ` Jarkko Sakkinen
2021-11-07 19:58         ` Dave Hansen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).