Re: [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit

linux-sgx.vger.kernel.org archive mirror

* Re: [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit
@ 2022-01-28  1:30 Anand Krishnamoorthi
  2022-01-31 17:40 ` Accardi, Kristen C
  0 siblings, 1 reply; 5+ messages in thread
From: Anand Krishnamoorthi @ 2022-01-28  1:30 UTC (permalink / raw)
  To: linux-sgx

Some of our users rely on EPC paging to run enclaves that require GBs of memory on Coffee Lake VMs with less than 200MB EPC.
Would the 1.5x EPC backing pages limit prevent these users from running their enclaves?

- Anand

* Re: [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit
  2022-01-28  1:30 [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit Anand Krishnamoorthi
@ 2022-01-31 17:40 ` Accardi, Kristen C
  2022-02-01 14:46   ` Anand Krishnamoorthi
  0 siblings, 1 reply; 5+ messages in thread
From: Accardi, Kristen C @ 2022-01-31 17:40 UTC (permalink / raw)
  To: linux-sgx, anakrish

On Fri, 2022-01-28 at 01:30 +0000, Anand Krishnamoorthi wrote:
> Some of our users rely on EPC paging to run enclaves that require GBs
> of memory on Coffee Lake VMs with less than 200MB EPC.
> Would the 1.5x EPC backing pages limit prevent these users from
> running their enclaves?
> 
> - Anand

If you had 200MB of EPC, you'd be limited to 300MB of backing pages.
The goal of this patch was to prevent users from using up all the
memory on the system with backing pages, which having GB of backing
pages would do pretty quickly. A module param which the admin used to
adjust the overcommit percentage was proposed to handle cases like this
where users would want to determine the level of overcommit they
allowed. Would something like this be used by your users?
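
To make the arithmetic concrete, here is a rough userspace sketch (not
kernel code). The helper names and the 2 GiB target are made up for
illustration, and MB is read as MiB:

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Backing-page cap in bytes for a given EPC size and overcommit
 * percentage. Multiplies before dividing so that a value such as 150
 * keeps its fractional part (150% == 1.5x).
 */
uint64_t backing_cap_bytes(uint64_t epc_bytes, unsigned int percent)
{
	/* Guard against overflow of the intermediate product. */
	assert(percent == 0 || epc_bytes <= UINT64_MAX / percent);
	return epc_bytes * percent / 100;
}

/*
 * Smallest percentage whose cap covers wanted_bytes of backing storage,
 * rounded up to the next whole percent.
 */
unsigned int min_percent_for(uint64_t epc_bytes, uint64_t wanted_bytes)
{
	assert(epc_bytes > 0);
	assert(wanted_bytes <= (UINT64_MAX - epc_bytes) / 100);
	return (unsigned int)((wanted_bytes * 100 + epc_bytes - 1) / epc_bytes);
}
```

With 200 MiB of EPC, backing_cap_bytes(200 * MiB, 150) comes to 300 MiB,
and min_percent_for(200 * MiB, 2048 * MiB) comes to 1024, so an admin
would need a setting of roughly 1000% to allow 2 GiB of backing pages.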

* Re: [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit
  2022-01-31 17:40 ` Accardi, Kristen C
@ 2022-02-01 14:46   ` Anand Krishnamoorthi
  0 siblings, 0 replies; 5+ messages in thread
From: Anand Krishnamoorthi @ 2022-02-01 14:46 UTC (permalink / raw)
  To: Accardi, Kristen C, linux-sgx, Bo Zhang (ACC)

> A module param which the admin used to
> adjust the overcommit percentage was proposed to handle cases like this
> where users would want to determine the level of overcommit they
> allowed. Would something like this be used by your users?

A module param with a high default value (in GBs for CFL systems) would allow flexibility.
Would changing the module param require a reboot or just a module reload?

* Re: [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit
  2022-01-18 17:57 ` [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit Kristen Carlson Accardi
@ 2022-01-20 13:07   ` Jarkko Sakkinen
  0 siblings, 0 replies; 5+ messages in thread
From: Jarkko Sakkinen @ 2022-01-20 13:07 UTC (permalink / raw)
  To: Kristen Carlson Accardi, linux-sgx, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin
  Cc: linux-kernel

On Tue, 2022-01-18 at 09:57 -0800, Kristen Carlson Accardi wrote:
> When the system runs out of enclave memory, SGX can reclaim EPC pages
> by swapping to normal RAM. This normal RAM is allocated via a
> per-enclave shared memory area. The shared memory area is not mapped
> into the enclave or the task mapping it, which makes its memory use
> opaque (including to the OOM killer). Having lots of hard to find
> memory around is problematic, especially when there is no limit.
> 
> Introduce a global counter that can be used to limit the number of
> pages
> that enclaves are able to consume for backing storage.  This
> parameter
> is a percentage value that is used in conjunction with the number of
> EPC pages in the system to set a cap on the amount of backing RAM
> that
> can be consumed.
> 
> The default for this value is 150, which limits the total number of
> shared memory pages that may be consumed by all enclaves as backing
> pages to 1.5X of EPC pages on the system. For example, on an SGX
> system that has 128MB of EPC, this default would cap the amount of
> normal RAM that SGX consumes for its shared memory areas at 192MB.
> The value of 1.5x the number of EPC pages was chosen because it
> should
> handle the most common case of a few enclaves that don't need much
> overcommit without any impact to user space. In the less common case
> where there are many enclaves, or a few large enclaves which need
> a lot of overcommit due to large EPC memory requirements, the
> reclaimer may fail to allocate a backing page for swapping if the
> limit has been reached. In this case, the page will not be able
> to allocate any new EPC pages. Any ioctl or call to add new EPC
> pages will get -ENOMEM, so for example, new enclaves will fail to
> load, and new EPC pages will not be able to be added.
> 
> The SGX overcommit_percent works differently than the core VM
> overcommit
> limit. Enclaves request backing pages one page at a time, and the
> number
> of in use backing pages that are allowed is a global resource that is
> limited for all enclaves.
> 
> Introduce a pair of functions which can be used by callers when
> requesting
> backing RAM pages. These functions are responsible for accounting the
> page charges. A request may return an error if the request will cause
> the
> counter to exceed the backing page cap.
> 
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 45
> ++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/sgx.h  |  2 ++
>  2 files changed, 47 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c
> b/arch/x86/kernel/cpu/sgx/main.c
> index 2857a49f2335..261e3702aef9 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -43,6 +43,45 @@ static struct sgx_numa_node *sgx_numa_nodes;
>  
>  static LIST_HEAD(sgx_dirty_page_list);
>  
> +/*
> + * Limits the amount of normal RAM that SGX can consume for EPC
> + * overcommit to the total EPC pages * sgx_overcommit_percent / 100
> + */
> +static int sgx_overcommit_percent = 150;
> +
> +/* The number of pages that can be allocated globally for backing
> storage. */
> +static atomic_long_t sgx_nr_available_backing_pages;
> +
> +/**
> + * sgx_charge_mem() - charge for a page used for backing storage
> + *
> + * Backing storage usage is capped by the
> sgx_nr_available_backing_pages.
> + * If the backing storage usage is over the overcommit limit,
> + * return an error.
> + *
> + * Return:
> + * 0:          The page requested does not exceed the limit
> + * -ENOMEM:    The page requested exceeds the overcommit limit
> + */
> +int sgx_charge_mem(void)
> +{
> +       if (!atomic_long_add_unless(&sgx_nr_available_backing_pages,
> -1, 0))
> +               return -ENOMEM;
> +
> +       return 0;
> +}
> +
> +/**
> + * sgx_uncharge_mem() - uncharge a page previously used for backing
> storage
> + *
> + * When backing storage is no longer in use, increment the
> + * sgx_nr_available_backing_pages counter.
> + */
> +void sgx_uncharge_mem(void)
> +{
> +       atomic_long_inc(&sgx_nr_available_backing_pages);
> +}
> +
>  /*
>   * Reset post-kexec EPC pages to the uninitialized state. The pages
> are removed
>   * from the input list, and made available for the page allocator.
> SECS pages
> @@ -783,6 +822,8 @@ static inline u64 __init
> sgx_calc_section_metric(u64 low, u64 high)
>  static bool __init sgx_page_cache_init(void)
>  {
>         u32 eax, ebx, ecx, edx, type;
> +       u64 available_backing_bytes;
> +       u64 total_epc_bytes = 0;
>         u64 pa, size;
>         int nid;
>         int i;
> @@ -830,6 +871,7 @@ static bool __init sgx_page_cache_init(void)
>  
>                 sgx_epc_sections[i].node =  &sgx_numa_nodes[nid];
>                 sgx_numa_nodes[nid].size += size;
> +               total_epc_bytes += size;
>  
>                 sgx_nr_epc_sections++;
>         }
> @@ -839,6 +881,9 @@ static bool __init sgx_page_cache_init(void)
>                 return false;
>         }
>  
> +       available_backing_bytes = total_epc_bytes *
> (sgx_overcommit_percent / 100);
> +       atomic_long_set(&sgx_nr_available_backing_pages,
> available_backing_bytes >> PAGE_SHIFT);
> +
>         return true;
>  }
>  
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h
> b/arch/x86/kernel/cpu/sgx/sgx.h
> index 0f17def9fe6f..3507a9983fc1 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -89,6 +89,8 @@ void sgx_free_epc_page(struct sgx_epc_page *page);
>  void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
>  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
> +int sgx_charge_mem(void);
> +void sgx_uncharge_mem(void);
>  
>  #ifdef CONFIG_X86_SGX_KVM
>  int __init sgx_vepc_init(void);

This looks good to me. While looking at the shmem code to write patches
adding the checks that Dave suggested, I also found out where the "charge"
naming comes from (shmem_charge(), shmem_uncharge()).

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* [PATCH v3 1/2] x86/sgx: Add accounting for tracking overcommit
  2022-01-18 17:57 [PATCH v3 0/2] x86/sgx: Limit EPC overcommit Kristen Carlson Accardi
@ 2022-01-18 17:57 ` Kristen Carlson Accardi
  2022-01-20 13:07   ` Jarkko Sakkinen
  0 siblings, 1 reply; 5+ messages in thread
From: Kristen Carlson Accardi @ 2022-01-18 17:57 UTC (permalink / raw)
  To: linux-sgx, Jarkko Sakkinen, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin
  Cc: linux-kernel, Kristen Carlson Accardi

When the system runs out of enclave memory, SGX can reclaim EPC pages
by swapping to normal RAM. This normal RAM is allocated via a
per-enclave shared memory area. The shared memory area is not mapped
into the enclave or the task mapping it, which makes its memory use
opaque (including to the OOM killer). Having lots of hard-to-find
memory around is problematic, especially when there is no limit.

Introduce a global counter that can be used to limit the number of pages
that enclaves are able to consume for backing storage.  The cap is
derived from a percentage value, sgx_overcommit_percent, applied to
the number of EPC pages in the system.

The default for this value is 150, which limits the total number of
shared memory pages that may be consumed by all enclaves as backing
pages to 1.5X of EPC pages on the system. For example, on an SGX
system that has 128MB of EPC, this default would cap the amount of
normal RAM that SGX consumes for its shared memory areas at 192MB.
The value of 1.5x the number of EPC pages was chosen because it should
handle the most common case of a few enclaves that don't need much
overcommit without any impact to user space. In the less common case
where there are many enclaves, or a few large enclaves which need
a lot of overcommit due to large EPC memory requirements, the
reclaimer may fail to allocate a backing page for swapping if the
limit has been reached. In this case, the system will not be able
to allocate any new EPC pages. Any ioctl or call to add new EPC
pages will get -ENOMEM, so, for example, new enclaves will fail to
load and new EPC pages cannot be added.

The SGX overcommit_percent works differently than the core VM overcommit
limit. Enclaves request backing pages one page at a time, and the number
of in use backing pages that are allowed is a global resource that is
limited for all enclaves.

Introduce a pair of functions which can be used by callers when requesting
backing RAM pages. These functions are responsible for accounting the
page charges. A request may return an error if the request will cause the
counter to exceed the backing page cap.
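
For illustration only, the intended semantics of the pair can be modelled
in userspace with C11 atomics. The names below are hypothetical stand-ins,
and the compare-and-swap loop is an assumed equivalent of
atomic_long_add_unless(v, -1, 0), not the kernel implementation:

```c
#include <errno.h>
#include <stdatomic.h>

/* Remaining budget of backing pages; stands in for the global counter. */
static atomic_long nr_available_backing_pages;

/* Set the budget once, as the init path does with the computed cap. */
void backing_budget_init(long nr_pages)
{
	atomic_store(&nr_available_backing_pages, nr_pages);
}

/* Report how many pages may still be charged. */
long backing_budget_remaining(void)
{
	return atomic_load(&nr_available_backing_pages);
}

/*
 * Take one page from the budget. Returns 0 on success, or -ENOMEM
 * without touching the counter when the budget is already zero.
 */
int charge_mem(void)
{
	long cur = atomic_load(&nr_available_backing_pages);

	do {
		if (cur == 0)
			return -ENOMEM;
		/* On failure, cur is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&nr_available_backing_pages,
					       &cur, cur - 1));

	return 0;
}

/* Return one page to the budget. */
void uncharge_mem(void)
{
	atomic_fetch_add(&nr_available_backing_pages, 1);
}
```

A caller would charge before allocating a backing page and uncharge when
that page is released. The counter counts down from the cap, so the
"limit reached" check is a comparison against zero.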

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/main.c | 45 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/sgx.h  |  2 ++
 2 files changed, 47 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 2857a49f2335..261e3702aef9 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -43,6 +43,45 @@ static struct sgx_numa_node *sgx_numa_nodes;

 static LIST_HEAD(sgx_dirty_page_list);

+/*
+ * Limits the amount of normal RAM that SGX can consume for EPC
+ * overcommit to the total EPC pages * sgx_overcommit_percent / 100
+ */
+static int sgx_overcommit_percent = 150;
+
+/* The number of pages that can be allocated globally for backing storage. */
+static atomic_long_t sgx_nr_available_backing_pages;
+
+/**
+ * sgx_charge_mem() - charge for a page used for backing storage
+ *
+ * Backing storage usage is capped by the sgx_nr_available_backing_pages.
+ * If the backing storage usage is over the overcommit limit,
+ * return an error.
+ *
+ * Return:
+ * 0:		The page requested does not exceed the limit
+ * -ENOMEM:	The page requested exceeds the overcommit limit
+ */
+int sgx_charge_mem(void)
+{
+	if (!atomic_long_add_unless(&sgx_nr_available_backing_pages, -1, 0))
+		return -ENOMEM;
+
+	return 0;
+}
+
+/**
+ * sgx_uncharge_mem() - uncharge a page previously used for backing storage
+ *
+ * When backing storage is no longer in use, increment the
+ * sgx_nr_available_backing_pages counter.
+ */
+void sgx_uncharge_mem(void)
+{
+	atomic_long_inc(&sgx_nr_available_backing_pages);
+}
+
 /*
  * Reset post-kexec EPC pages to the uninitialized state. The pages are removed
  * from the input list, and made available for the page allocator. SECS pages
@@ -783,6 +822,8 @@ static inline u64 __init sgx_calc_section_metric(u64 low, u64 high)
 static bool __init sgx_page_cache_init(void)
 {
 	u32 eax, ebx, ecx, edx, type;
+	u64 available_backing_bytes;
+	u64 total_epc_bytes = 0;
 	u64 pa, size;
 	int nid;
 	int i;
@@ -830,6 +871,7 @@ static bool __init sgx_page_cache_init(void)

 		sgx_epc_sections[i].node =  &sgx_numa_nodes[nid];
 		sgx_numa_nodes[nid].size += size;
+		total_epc_bytes += size;

 		sgx_nr_epc_sections++;
 	}
@@ -839,6 +881,9 @@ static bool __init sgx_page_cache_init(void)
 		return false;
 	}

+	available_backing_bytes = total_epc_bytes * sgx_overcommit_percent / 100;
+	atomic_long_set(&sgx_nr_available_backing_pages, available_backing_bytes >> PAGE_SHIFT);
+
 	return true;
 }

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 0f17def9fe6f..3507a9983fc1 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -89,6 +89,8 @@ void sgx_free_epc_page(struct sgx_epc_page *page);
 void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
+int sgx_charge_mem(void);
+void sgx_uncharge_mem(void);

 #ifdef CONFIG_X86_SGX_KVM
 int __init sgx_vepc_init(void);
-- 
2.20.1
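
The 128MB to 192MB example in the changelog depends on the order of the
integer operations when the percentage is applied. A small userspace
sketch, with hypothetical helper names and an assumed 4 KiB page size,
compares the two orderings:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/*
 * Divide first: percent / 100 is integer division, so 150 / 100 is 1
 * and the cap collapses to 1x the EPC size.
 */
uint64_t cap_pages_divide_first(uint64_t epc_bytes, int percent)
{
	assert(percent >= 0);
	return (epc_bytes * (uint64_t)(percent / 100)) >> DEMO_PAGE_SHIFT;
}

/*
 * Multiply first: the full percentage is applied before the division,
 * so 150 yields 1.5x the EPC size.
 */
uint64_t cap_pages_multiply_first(uint64_t epc_bytes, int percent)
{
	assert(percent >= 0);
	return (epc_bytes * (uint64_t)percent / 100) >> DEMO_PAGE_SHIFT;
}
```

For 128 MiB of EPC and a percentage of 150, the divide-first form gives
32768 pages (128 MiB) while the multiply-first form gives 49152 pages
(192 MiB), which is the figure quoted in the changelog.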
