All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Fixups for overcommit limit pages
@ 2022-01-26 19:17 Kristen Carlson Accardi
  2022-01-26 19:17 ` [PATCH 1/2] x86/sgx: fixup for available backing pages calculation Kristen Carlson Accardi
  2022-01-26 19:17 ` [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure Kristen Carlson Accardi
  0 siblings, 2 replies; 7+ messages in thread
From: Kristen Carlson Accardi @ 2022-01-26 19:17 UTC (permalink / raw)
  To: linux-sgx, dave.hansen; +Cc: haitao.huang, jarkko

Hi Dave,
Here are some fixups for the patches in your tree for the overcommit
limit. The first was send earlier but is a resend. The second fixes
an issue reported by Haitao with hanging while attempting to add pages
when the limit is reached.

Kristen Carlson Accardi (2):
  x86/sgx: fixup for available backing pages calculation
  x86/sgx: Allow sgx_reclaim_pages() to report failure

 arch/x86/kernel/cpu/sgx/main.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] x86/sgx: fixup for available backing pages calculation
  2022-01-26 19:17 [PATCH 0/2] Fixups for overcommit limit pages Kristen Carlson Accardi
@ 2022-01-26 19:17 ` Kristen Carlson Accardi
  2022-01-26 20:17   ` Jarkko Sakkinen
  2022-01-26 19:17 ` [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure Kristen Carlson Accardi
  1 sibling, 1 reply; 7+ messages in thread
From: Kristen Carlson Accardi @ 2022-01-26 19:17 UTC (permalink / raw)
  To: linux-sgx, dave.hansen; +Cc: haitao.huang, jarkko

Remove improper parentheses from calculation for available
backing bytes. Wtihout this fix, the result will be incorrect
due to rounding.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 990f341bbd30..c4030fb608c6 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -881,7 +881,7 @@ static bool __init sgx_page_cache_init(void)
 		return false;
 	}
 
-	available_backing_bytes = total_epc_bytes * (sgx_overcommit_percent / 100);
+	available_backing_bytes = total_epc_bytes * sgx_overcommit_percent / 100;
 	atomic_long_set(&sgx_nr_available_backing_pages, available_backing_bytes >> PAGE_SHIFT);
 
 	return true;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure
  2022-01-26 19:17 [PATCH 0/2] Fixups for overcommit limit pages Kristen Carlson Accardi
  2022-01-26 19:17 ` [PATCH 1/2] x86/sgx: fixup for available backing pages calculation Kristen Carlson Accardi
@ 2022-01-26 19:17 ` Kristen Carlson Accardi
  2022-01-26 20:19   ` Jarkko Sakkinen
  2022-01-28 18:26   ` Dave Hansen
  1 sibling, 2 replies; 7+ messages in thread
From: Kristen Carlson Accardi @ 2022-01-26 19:17 UTC (permalink / raw)
  To: linux-sgx, dave.hansen; +Cc: haitao.huang, jarkko

If backing pages are not able to be allocated during
sgx_reclaim_pages(), return an error code to the caller.
sgx_reclaim_pages() can be called from the reclaimer thread,
or when adding pages via an ioctl. When it is called from the
kernel thread, it's safe to ignore the return value, however,
calls from the ioctls should forward the error.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index c4030fb608c6..0e95f69ebcb7 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -377,17 +377,18 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
  * problematic as it would increase the lock contention too much, which would
  * halt forward progress.
  */
-static void sgx_reclaim_pages(void)
+static int sgx_reclaim_pages(void)
 {
 	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
 	struct sgx_backing backing[SGX_NR_TO_SCAN];
 	struct sgx_epc_section *section;
 	struct sgx_encl_page *encl_page;
+	int pages_being_reclaimed = 0;
 	struct sgx_epc_page *epc_page;
 	struct sgx_numa_node *node;
 	pgoff_t page_index;
 	int cnt = 0;
-	int ret;
+	int ret = 0;
 	int i;
 
 	spin_lock(&sgx_reclaimer_lock);
@@ -422,6 +423,8 @@ static void sgx_reclaim_pages(void)
 		if (ret)
 			goto skip;
 
+		pages_being_reclaimed++;
+
 		mutex_lock(&encl_page->encl->lock);
 		encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
 		mutex_unlock(&encl_page->encl->lock);
@@ -437,6 +440,9 @@ static void sgx_reclaim_pages(void)
 		chunk[i] = NULL;
 	}
 
+	if (!pages_being_reclaimed)
+		return ret;
+
 	for (i = 0; i < cnt; i++) {
 		epc_page = chunk[i];
 		if (epc_page)
@@ -463,6 +469,7 @@ static void sgx_reclaim_pages(void)
 		spin_unlock(&node->lock);
 		atomic_long_inc(&sgx_nr_free_pages);
 	}
+	return ret;
 }
 
 static bool sgx_should_reclaim(unsigned long watermark)
@@ -636,6 +643,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 {
 	struct sgx_epc_page *page;
+	int ret;
 
 	for ( ; ; ) {
 		page = __sgx_alloc_epc_page();
@@ -657,7 +665,11 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 			break;
 		}
 
-		sgx_reclaim_pages();
+		ret = sgx_reclaim_pages();
+		if (ret) {
+			page = ERR_PTR(-ENOMEM);
+			break;
+		}
 		cond_resched();
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] x86/sgx: fixup for available backing pages calculation
  2022-01-26 19:17 ` [PATCH 1/2] x86/sgx: fixup for available backing pages calculation Kristen Carlson Accardi
@ 2022-01-26 20:17   ` Jarkko Sakkinen
  0 siblings, 0 replies; 7+ messages in thread
From: Jarkko Sakkinen @ 2022-01-26 20:17 UTC (permalink / raw)
  To: Kristen Carlson Accardi; +Cc: linux-sgx, dave.hansen, haitao.huang

On Wed, Jan 26, 2022 at 11:17:10AM -0800, Kristen Carlson Accardi wrote:
> Remove improper parentheses from calculation for available
> backing bytes. Wtihout this fix, the result will be incorrect
> due to rounding.
> 
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 990f341bbd30..c4030fb608c6 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -881,7 +881,7 @@ static bool __init sgx_page_cache_init(void)
>  		return false;
>  	}
>  
> -	available_backing_bytes = total_epc_bytes * (sgx_overcommit_percent / 100);
> +	available_backing_bytes = total_epc_bytes * sgx_overcommit_percent / 100;
>  	atomic_long_set(&sgx_nr_available_backing_pages, available_backing_bytes >> PAGE_SHIFT);
>  
>  	return true;
> -- 
> 2.20.1
> 

You can turn it as:

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure
  2022-01-26 19:17 ` [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure Kristen Carlson Accardi
@ 2022-01-26 20:19   ` Jarkko Sakkinen
  2022-01-26 20:20     ` Jarkko Sakkinen
  2022-01-28 18:26   ` Dave Hansen
  1 sibling, 1 reply; 7+ messages in thread
From: Jarkko Sakkinen @ 2022-01-26 20:19 UTC (permalink / raw)
  To: Kristen Carlson Accardi; +Cc: linux-sgx, dave.hansen, haitao.huang

On Wed, Jan 26, 2022 at 11:17:11AM -0800, Kristen Carlson Accardi wrote:
> If backing pages are not able to be allocated during
> sgx_reclaim_pages(), return an error code to the caller.
> sgx_reclaim_pages() can be called from the reclaimer thread,
> or when adding pages via an ioctl. When it is called from the
> kernel thread, it's safe to ignore the return value, however,
> calls from the ioctls should forward the error.
> 
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index c4030fb608c6..0e95f69ebcb7 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -377,17 +377,18 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>   * problematic as it would increase the lock contention too much, which would
>   * halt forward progress.
>   */
> -static void sgx_reclaim_pages(void)
> +static int sgx_reclaim_pages(void)
>  {
>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
>  	struct sgx_epc_section *section;
>  	struct sgx_encl_page *encl_page;
> +	int pages_being_reclaimed = 0;
 
LGTM but why this is signed?

/Jarkko
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure
  2022-01-26 20:19   ` Jarkko Sakkinen
@ 2022-01-26 20:20     ` Jarkko Sakkinen
  0 siblings, 0 replies; 7+ messages in thread
From: Jarkko Sakkinen @ 2022-01-26 20:20 UTC (permalink / raw)
  To: Kristen Carlson Accardi; +Cc: linux-sgx, dave.hansen, haitao.huang

On Wed, Jan 26, 2022 at 10:19:26PM +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 26, 2022 at 11:17:11AM -0800, Kristen Carlson Accardi wrote:
> > If backing pages are not able to be allocated during
> > sgx_reclaim_pages(), return an error code to the caller.
> > sgx_reclaim_pages() can be called from the reclaimer thread,
> > or when adding pages via an ioctl. When it is called from the
> > kernel thread, it's safe to ignore the return value, however,
> > calls from the ioctls should forward the error.
> > 
> > Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> > ---
> >  arch/x86/kernel/cpu/sgx/main.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index c4030fb608c6..0e95f69ebcb7 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -377,17 +377,18 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
> >   * problematic as it would increase the lock contention too much, which would
> >   * halt forward progress.
> >   */
> > -static void sgx_reclaim_pages(void)
> > +static int sgx_reclaim_pages(void)
> >  {
> >  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
> >  	struct sgx_backing backing[SGX_NR_TO_SCAN];
> >  	struct sgx_epc_section *section;
> >  	struct sgx_encl_page *encl_page;
> > +	int pages_being_reclaimed = 0;
>  
> LGTM but why this is signed?

Not that it mattered tho, just interested.

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure
  2022-01-26 19:17 ` [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure Kristen Carlson Accardi
  2022-01-26 20:19   ` Jarkko Sakkinen
@ 2022-01-28 18:26   ` Dave Hansen
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2022-01-28 18:26 UTC (permalink / raw)
  To: Kristen Carlson Accardi, linux-sgx; +Cc: haitao.huang, jarkko

On 1/26/22 11:17, Kristen Carlson Accardi wrote:
> If backing pages are not able to be allocated during
> sgx_reclaim_pages(), return an error code to the caller.
> sgx_reclaim_pages() can be called from the reclaimer thread,
> or when adding pages via an ioctl. When it is called from the

ioctl() is a function name.  Please add parenthesis to make that clear,
just like any other function name.

> kernel thread, it's safe to ignore the return value, however,
> calls from the ioctls should forward the error.
> 
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index c4030fb608c6..0e95f69ebcb7 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -377,17 +377,18 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>   * problematic as it would increase the lock contention too much, which would
>   * halt forward progress.
>   */
> -static void sgx_reclaim_pages(void)
> +static int sgx_reclaim_pages(void)
>  {
>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
>  	struct sgx_epc_section *section;
>  	struct sgx_encl_page *encl_page;
> +	int pages_being_reclaimed = 0;
>  	struct sgx_epc_page *epc_page;
>  	struct sgx_numa_node *node;
>  	pgoff_t page_index;
>  	int cnt = 0;
> -	int ret;
> +	int ret = 0;
>  	int i;
>  
>  	spin_lock(&sgx_reclaimer_lock);
> @@ -422,6 +423,8 @@ static void sgx_reclaim_pages(void)
>  		if (ret)
>  			goto skip;
>  
> +		pages_being_reclaimed++;

I think we can do better on the naming there.  Yes, this is number of
pages which have 'SGX_ENCL_PAGE_BEING_RECLAIMED' set, but what does that
*mean*?  What are the implications?

For instance, 'backing_pages_allocated' would imply that there are
future backing pages to use (or clean up).

>  		mutex_lock(&encl_page->encl->lock);
>  		encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
>  		mutex_unlock(&encl_page->encl->lock);
> @@ -437,6 +440,9 @@ static void sgx_reclaim_pages(void)
>  		chunk[i] = NULL;
>  	}
>  
> +	if (!pages_being_reclaimed)
> +		return ret;

I think this needs a comment.  It will return the error from the *last*
failure of sgx_encl_get_backing().  That's fine, I guess, but it's a bit
weird because there could have been 100 errors and the first 99 errors
are ignored.

>  	for (i = 0; i < cnt; i++) {
>  		epc_page = chunk[i];
>  		if (epc_page)
> @@ -463,6 +469,7 @@ static void sgx_reclaim_pages(void)
>  		spin_unlock(&node->lock);
>  		atomic_long_inc(&sgx_nr_free_pages);
>  	}
> +	return ret;
>  }


Let's say cnt=2.  The first run through the loop does
sgx_encl_get_backing() and increments pages_being_reclaimed.  The second
run through the loop hits an error, sets ret=-ESOMETHING.  The loop
terminates.

	if (!pages_being_reclaimed) <-- false
		return ret;

	... keep running

Then, we get to the bottom of the function.  One page was reclaimed.
Success!  But, ret=-ESOMETHING.  sgx_reclaim_pages() will return an error.

Right?

I think this is structured wrong.  In the end, we want to know whether
sgx_reclaim_pages() made any progress.  Let's have it return *that*.
How many pages did it successfully reclaim?

That has some really nice properties, especially if we wait until the
last possible moment to manipulate the count.  Perhaps:

static int sgx_reclaim_pages(void)
{
	...
	int nr_pages_reclaimed = 0;
	...

	// The last loop:
        for (i = 0; i < cnt; i++) {
                epc_page = chunk[i];
                if (!epc_page)
                        continue;
		...
                atomic_long_inc(&sgx_nr_free_pages);
		nr_pages_reclaimed++
        }

	return nr_pages_reclaimed;
}

That makes it *blatantly* obvious what the function returns since the
only manipulation of 'nr_pages_reclaimed' is right next to the return.
It also makes the function resilient to any new points of failure.
Let's say that we want to check for sgx_reclaimer_block() failures.
With this patch's approach, we have to add a new check and return for
"pages_being_reclaimed", or even an entirely new counter.

With the approach I'm suggesting, it "just works".

>  static bool sgx_should_reclaim(unsigned long watermark)
> @@ -636,6 +643,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
>  {
>  	struct sgx_epc_page *page;
> +	int ret;
>  
>  	for ( ; ; ) {
>  		page = __sgx_alloc_epc_page();
> @@ -657,7 +665,11 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
>  			break;
>  		}
>  
> -		sgx_reclaim_pages();
> +		ret = sgx_reclaim_pages();
> +		if (ret) {
> +			page = ERR_PTR(-ENOMEM);
> +			break;
> +		}
>  		cond_resched();
>  	}

So, we go to the trouble of plumbing a real error code out of
sgx_reclaim_pages(), but then throw it away here.  Why?


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-01-28 18:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-26 19:17 [PATCH 0/2] Fixups for overcommit limit pages Kristen Carlson Accardi
2022-01-26 19:17 ` [PATCH 1/2] x86/sgx: fixup for available backing pages calculation Kristen Carlson Accardi
2022-01-26 20:17   ` Jarkko Sakkinen
2022-01-26 19:17 ` [PATCH 2/2] x86/sgx: Allow sgx_reclaim_pages() to report failure Kristen Carlson Accardi
2022-01-26 20:19   ` Jarkko Sakkinen
2022-01-26 20:20     ` Jarkko Sakkinen
2022-01-28 18:26   ` Dave Hansen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.