From: Kristen Carlson Accardi <kristen@linux.intel.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	cgroups@vger.kernel.org, Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <songmuchun@bytedance.com>
Subject: Re: [RFC PATCH 00/20] Add Cgroup support for SGX EPC memory
Date: Thu, 22 Sep 2022 11:59:14 -0700	[thread overview]
Message-ID: <4b8605533e5deade739249bfb341ab9c06d56a1e.camel@linux.intel.com> (raw)
In-Reply-To: <YyyeSVSk/lWdo/W4@slm.duckdns.org>

On Thu, 2022-09-22 at 07:41 -1000, Tejun Heo wrote:
> Hello,
> 
> (cc'ing memcg folks)
> 
> On Thu, Sep 22, 2022 at 10:10:37AM -0700, Kristen Carlson Accardi
> wrote:
> > Add a new cgroup controller to regulate the distribution of SGX EPC
> > memory, which is a subset of system RAM that is used to provide
> > SGX-enabled applications with protected memory, and is otherwise
> > inaccessible.
> > 
> > SGX EPC memory allocations are separate from normal RAM allocations,
> > and are managed solely by the SGX subsystem. The existing cgroup
> > memory controller cannot be used to limit or account for SGX EPC
> > memory.
> > 
> > This patchset implements the sgx_epc cgroup controller, which will
> > provide support for stats, events, and the following interface files:
> > 
> > sgx_epc.current
> >         A read-only value which represents the total amount of EPC
> >         memory currently being used by the cgroup and its
> >         descendants.
> > 
> > sgx_epc.low
> >         A read-write value which is used to set best-effort
> >         protection of EPC usage. If the EPC usage of a cgroup drops
> >         below this value, then the cgroup's EPC memory will not be
> >         reclaimed, if possible.
> > 
> > sgx_epc.high
> >         A read-write value which is used to set a best-effort limit
> >         on the amount of EPC usage a cgroup has. If a cgroup's usage
> >         goes past the high value, the EPC memory of that cgroup will
> >         be reclaimed back under the high limit.
> > 
> > sgx_epc.max
> >         A read-write value which is used to set a hard limit for
> >         cgroup EPC usage. If a cgroup's EPC usage reaches this
> >         limit, allocations are blocked until EPC memory can be
> >         reclaimed from the cgroup.
> 
> I don't know how SGX uses its memory, but you said in the other
> message that it's usually a really small portion of memory, and
> glancing at the code it looks like it does its own page aging and all.
> Can you give some concrete examples of how it's used and why we need
> cgroup support for it? Also, do you really need all three control
> knobs here? e.g. given that .high is only really useful in conjunction
> with memory pressure and oom handling from userspace, I don't see how
> this would actually be useful for something like this.
> 
> Thanks.
> 

Thanks for your question. SGX EPC memory is a globally shared resource
that can be overcommitted, and the SGX EPC controller should be used
similarly to the normal memory controller. When there is pressure on
EPC memory, the reclaimer thread writes pages out of EPC memory to
backing RAM that is allocated per enclave. Currently, even a single
enclave can force all the other enclaves to have their EPC pages
written out to backing RAM, simply by allocating all the available
system EPC memory. This causes performance problems for those enclaves,
which must then fault to page their contents back in.
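As a rough illustration, an administrator can see how much EPC the
kernel found and watch an enclave host's fault activity from the shell.
The log line format and the use of major faults as a proxy for EPC
reclaim pressure are assumptions, and `ENCLAVE_PID` is a placeholder:

```shell
# The SGX driver logs the EPC sections it enumerates at boot; the exact
# message format varies by kernel version.
dmesg | grep -i 'EPC section'

# Major-fault count of an enclave host process, as a crude proxy for
# how often its pages are being paged back in under EPC pressure.
ENCLAVE_PID=1234   # placeholder: substitute the real enclave host PID
ps -o maj_flt= -p "$ENCLAVE_PID"
```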

The sgx_epc.high value will help control the EPC usage of a cgroup: the
SGX reclaimer uses this value to keep the total EPC usage of a cgroup
from exceeding it (best effort). This lets a system administrator
prevent a single enclave, or a group of enclaves, from allocating all
of the EPC memory and degrading performance for the other enclaves on
the system. sgx_epc.max can be used to set a hard limit, which will
cause an enclave to have all of its used pages zapped; the enclave is
effectively killed until it is rebuilt by the owning SGX application.
sgx_epc.low can be used to try (best effort) to ensure that some
minimum number of EPC pages is protected for the enclaves in a
particular cgroup. This can be useful for preventing evictions, and
thus performance problems due to faults.
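If the controller from this patchset were enabled, the knobs would be
exercised like any other cgroup v2 controller. A sketch only: the mount
point, the "enclaves" cgroup name, and the acceptance of "M" suffixes
(as the memory controller allows) are all assumptions:

```shell
# Enable the proposed controller for children of the root cgroup
# (controller name per this patchset).
echo "+sgx_epc" > /sys/fs/cgroup/cgroup.subtree_control

# Create a cgroup for a set of enclave workloads.
mkdir /sys/fs/cgroup/enclaves

echo "64M"  > /sys/fs/cgroup/enclaves/sgx_epc.low   # best-effort floor
echo "256M" > /sys/fs/cgroup/enclaves/sgx_epc.high  # best-effort ceiling
echo "512M" > /sys/fs/cgroup/enclaves/sgx_epc.max   # hard cap

# Read back the current EPC usage of the cgroup and its descendants.
cat /sys/fs/cgroup/enclaves/sgx_epc.current
```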

I hope this answers your question.

Thanks,
Kristen


Thread overview: 84+ messages
2022-09-22 17:10 [RFC PATCH 00/20] Add Cgroup support for SGX EPC memory Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 01/20] x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages() Kristen Carlson Accardi
2022-09-23 12:32   ` Jarkko Sakkinen
2022-09-23 12:35     ` Jarkko Sakkinen
2022-09-23 12:37       ` Jarkko Sakkinen
2022-09-22 17:10 ` [RFC PATCH 02/20] x86/sgx: Store EPC page owner as a 'void *' to handle multiple users Kristen Carlson Accardi
2022-09-22 18:54   ` Dave Hansen
2022-09-23 12:49   ` Jarkko Sakkinen
2022-09-22 17:10 ` [RFC PATCH 03/20] x86/sgx: Track owning enclave in VA EPC pages Kristen Carlson Accardi
2022-09-22 18:55   ` Dave Hansen
2022-09-22 20:04     ` Kristen Carlson Accardi
2022-09-22 21:39       ` Dave Hansen
2022-09-23 12:52   ` Jarkko Sakkinen
2022-09-22 17:10 ` [RFC PATCH 04/20] x86/sgx: Add 'struct sgx_epc_lru' to encapsulate lru list(s) Kristen Carlson Accardi
2022-09-23 13:20   ` Jarkko Sakkinen
2022-09-29 23:04     ` Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 05/20] x86/sgx: Introduce unreclaimable EPC page lists Kristen Carlson Accardi
2022-09-23 13:29   ` Jarkko Sakkinen
2022-09-22 17:10 ` [RFC PATCH 06/20] x86/sgx: Introduce RECLAIM_IN_PROGRESS flag for EPC pages Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 07/20] x86/sgx: Use a list to track to-be-reclaimed pages during reclaim Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 08/20] x86/sgx: Add EPC page flags to identify type of page Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 09/20] x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 10/20] x86/sgx: Return the number of EPC pages that were successfully reclaimed Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 11/20] x86/sgx: Add option to ignore age of page during EPC reclaim Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 12/20] x86/sgx: Add helper to retrieve SGX EPC LRU given an EPC page Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 13/20] x86/sgx: Prepare for multiple LRUs Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 14/20] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 15/20] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 16/20] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 17/20] cgroup, x86/sgx: Add SGX EPC cgroup controller Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 18/20] x86/sgx: Enable EPC cgroup controller in SGX core Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 19/20] x86/sgx: Add stats and events interfaces to EPC cgroup controller Kristen Carlson Accardi
2022-09-22 17:10 ` [RFC PATCH 20/20] docs, cgroup, x86/sgx: Add SGX EPC cgroup controller documentation Kristen Carlson Accardi
2022-09-22 17:41 ` [RFC PATCH 00/20] Add Cgroup support for SGX EPC memory Tejun Heo
2022-09-22 18:59   ` Kristen Carlson Accardi [this message]
2022-09-22 19:08     ` Tejun Heo
2022-09-22 21:03       ` Dave Hansen
2022-09-24  0:09         ` Tejun Heo
2022-09-26 18:30           ` Kristen Carlson Accardi
2022-10-07 16:39           ` Kristen Carlson Accardi
2022-10-07 16:42             ` Tejun Heo
2022-10-07 16:46               ` Kristen Carlson Accardi
2022-09-23 12:24 ` Jarkko Sakkinen
