linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anand Krishnamoorthi <anakrish@microsoft.com>
To: Kristen Carlson Accardi <kristen@linux.intel.com>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"tj@kernel.org" <tj@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"Bo Zhang (ACC)" <zhanb@microsoft.com>
Cc: "zhiquan1.li@intel.com" <zhiquan1.li@intel.com>
Subject: Re: [EXTERNAL] [PATCH v2 00/18]  Add Cgroup support for SGX EPC memory
Date: Thu, 13 Apr 2023 18:49:53 +0000	[thread overview]
Message-ID: <DM6PR21MB11772A6ED915825854B419D6C4989@DM6PR21MB1177.namprd21.prod.outlook.com> (raw)
In-Reply-To: <DM6PR21MB11778ABA54CD33A5822D5B24C4929@DM6PR21MB1177.namprd21.prod.outlook.com>

For Azure, SGX cgroup support feature is very useful.
It is needed to enforce the EPC resource limitation of Kubernetes pods on SGX nodes.

Today, in Azure Kubernetes Service, each pod on SGX node claims a nominal EPC memory requirement. K8s will track the unclaimed EPC memories on SGX nodes to schedule pods.
However, there's no enforcement on the node whether the pod uses more EPC memory than what it claims. If EPC is running out on the node, the kernel will do EPC paging, which will cause all pods suffering performance degradation.

Cgroup support for EPC will enforce EPC resource limitation on pod level, so that when a pod tries to use more EPC than what it claims, it will be EPC paged while other pods are not affected.

-Anand

From: Anand Krishnamoorthi <anakrish@microsoft.com>
Sent: Monday, April 3, 2023 2:26 PM
To: Kristen Carlson Accardi <kristen@linux.intel.com>; jarkko@kernel.org <jarkko@kernel.org>; dave.hansen@linux.intel.com <dave.hansen@linux.intel.com>; tj@kernel.org <tj@kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-sgx@vger.kernel.org <linux-sgx@vger.kernel.org>; cgroups@vger.kernel.org <cgroups@vger.kernel.org>; Bo Zhang (ACC) <zhanb@microsoft.com>
Cc: zhiquan1.li@intel.com <zhiquan1.li@intel.com>
Subject: Re: [EXTERNAL] [PATCH v2 00/18] Add Cgroup support for SGX EPC memory

Adding Bo Zhang to thread.

-Anand


From: Kristen Carlson Accardi <kristen@linux.intel.com>
Sent: Friday, December 2, 2022 10:36 AM
To: jarkko@kernel.org <jarkko@kernel.org>; dave.hansen@linux.intel.com <dave.hansen@linux.intel.com>; tj@kernel.org <tj@kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-sgx@vger.kernel.org <linux-sgx@vger.kernel.org>; cgroups@vger.kernel.org <cgroups@vger.kernel.org>
Cc: zhiquan1.li@intel.com <zhiquan1.li@intel.com>
Subject: [EXTERNAL] [PATCH v2 00/18] Add Cgroup support for SGX EPC memory

Utilize the Miscellaneous cgroup controller to regulate the distribution
of SGX EPC memory, which is a subset of system RAM that is used to provide
SGX-enabled applications with protected memory, and is otherwise inaccessible.

SGX EPC memory allocations are separate from normal RAM allocations,
and is managed solely by the SGX subsystem. The existing cgroup memory
controller cannot be used to limit or account for SGX EPC memory.

This patchset implements the support for sgx_epc memory within the
misc cgroup controller, and then utilizes the misc cgroup controller
to provide support for setting the total system capacity, max limit
per cgroup, and events.

This work was originally authored by Sean Christopherson a few years ago,
and was modified to work with more recent kernels, and to utilize the
misc cgroup controller rather than a custom controller. It is currently
based on top of the MCA patches.

Here's the MCA patchset for reference.
https://lore.kernel.org/linux-sgx/2d52c8c4-8ed0-6df2-2911-da5b9fcc9ae4@intel.com/T/#t

The patchset adds support for multiple LRUs to track both reclaimable
EPC pages (i.e. pages the reclaimer knows about), as well as unreclaimable
EPC pages (i.e. pages which the reclaimer isn't aware of, such as va pages).
These pages are assigned to an LRU, as well as an enclave, so that an
enclave's full EPC usage can be tracked, and limited to a max value. During
OOM events, an enclave can be have its memory zapped, and all the EPC pages
not tracked by the reclaimer can be freed.

I appreciate your comments and feedback.

Changelog:

v2:
 * rename struct sgx_epc_lru to sgx_epc_lru_lists to be more clear
   that this struct contains 2 lists.
 * use inline functions rather than macros for sgx_epc_page_list*
   wrappers.
 * Remove flags macros and open code all flags.
 * Improve the commit message for RECLAIM_IN_PROGRESS patch to make
   it more clear what the patch does.
 * remove notifier_block from misc cgroup changes and use a set
   of ops for callbacks instead.
 * rename root_misc to misc_cg_root and parent_misc to misc_cg_parent
 * consolidate misc cgroup changes to 2 patches and remove most of
   the previous helper functions.

Kristen Carlson Accardi (7):
  x86/sgx: Add 'struct sgx_epc_lru_lists' to encapsulate lru list(s)
  x86/sgx: Use sgx_epc_lru_lists for existing active page list
  x86/sgx: Track epc pages on reclaimable or unreclaimable lists
  cgroup/misc: Add per resource callbacks for css events
  cgroup/misc: Prepare for SGX usage
  x86/sgx: Add support for misc cgroup controller
  Docs/x86/sgx: Add description for cgroup support

Sean Christopherson (11):
  x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages()
  x86/sgx: Store struct sgx_encl when allocating new VA pages
  x86/sgx: Introduce RECLAIM_IN_PROGRESS flag for EPC pages
  x86/sgx: Use a list to track to-be-reclaimed pages during reclaim
  x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default
  x86/sgx: Return the number of EPC pages that were successfully
    reclaimed
  x86/sgx: Add option to ignore age of page during EPC reclaim
  x86/sgx: Prepare for multiple LRUs
  x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup
  x86/sgx: Add helper to grab pages from an arbitrary EPC LRU
  x86/sgx: Add EPC OOM path to forcefully reclaim EPC

 Documentation/x86/sgx.rst            |  77 ++++
 arch/x86/Kconfig                     |  13 +
 arch/x86/kernel/cpu/sgx/Makefile     |   1 +
 arch/x86/kernel/cpu/sgx/encl.c       |  90 ++++-
 arch/x86/kernel/cpu/sgx/encl.h       |   4 +-
 arch/x86/kernel/cpu/sgx/epc_cgroup.c | 539 +++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/epc_cgroup.h |  59 +++
 arch/x86/kernel/cpu/sgx/ioctl.c      |  14 +-
 arch/x86/kernel/cpu/sgx/main.c       | 412 ++++++++++++++++----
 arch/x86/kernel/cpu/sgx/sgx.h        | 122 +++++-
 arch/x86/kernel/cpu/sgx/virt.c       |  28 +-
 include/linux/misc_cgroup.h          |  35 ++
 kernel/cgroup/misc.c                 |  76 +++-
 13 files changed, 1341 insertions(+), 129 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c
 create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h

--
2.38.1

  reply	other threads:[~2023-04-13 18:50 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-02 18:36 [PATCH v2 00/18] Add Cgroup support for SGX EPC memory Kristen Carlson Accardi
2022-12-02 18:36 ` [PATCH v2 01/18] x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages() Kristen Carlson Accardi
2022-12-02 21:33   ` Dave Hansen
2022-12-02 21:37     ` Kristen Carlson Accardi
2022-12-02 21:45       ` Dave Hansen
2022-12-02 22:17         ` Kristen Carlson Accardi
2022-12-02 22:37           ` Dave Hansen
2022-12-02 18:36 ` [PATCH v2 02/18] x86/sgx: Store struct sgx_encl when allocating new VA pages Kristen Carlson Accardi
2022-12-02 21:35   ` Dave Hansen
2022-12-02 21:40     ` Kristen Carlson Accardi
2022-12-02 21:48       ` Dave Hansen
2022-12-02 22:35         ` Sean Christopherson
2022-12-02 22:47           ` Dave Hansen
2022-12-02 22:49             ` Sean Christopherson
2022-12-02 18:36 ` [PATCH v2 03/18] x86/sgx: Add 'struct sgx_epc_lru_lists' to encapsulate lru list(s) Kristen Carlson Accardi
2022-12-02 21:39   ` Dave Hansen
2022-12-08 15:31   ` Jarkko Sakkinen
2022-12-08 18:03     ` Kristen Carlson Accardi
2022-12-02 18:36 ` [PATCH v2 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list Kristen Carlson Accardi
2022-12-02 21:43   ` Dave Hansen
2022-12-02 21:51     ` Kristen Carlson Accardi
2022-12-02 22:10       ` Dave Hansen
2022-12-02 18:36 ` [PATCH v2 05/18] x86/sgx: Track epc pages on reclaimable or unreclaimable lists Kristen Carlson Accardi
2022-12-02 22:13   ` Dave Hansen
2022-12-02 22:28     ` Sean Christopherson
2022-12-02 18:36 ` [PATCH v2 06/18] x86/sgx: Introduce RECLAIM_IN_PROGRESS flag for EPC pages Kristen Carlson Accardi
2022-12-02 22:15   ` Dave Hansen
2022-12-08 15:46   ` Jarkko Sakkinen
2022-12-08 18:13     ` Kristen Carlson Accardi
2022-12-02 18:36 ` [PATCH v2 07/18] x86/sgx: Use a list to track to-be-reclaimed pages during reclaim Kristen Carlson Accardi
2022-12-02 22:33   ` Dave Hansen
2022-12-05 16:33     ` Kristen Carlson Accardi
2022-12-05 17:03       ` Dave Hansen
2022-12-05 18:25         ` Kristen Carlson Accardi
2022-12-02 18:36 ` [PATCH v2 08/18] x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default Kristen Carlson Accardi
2022-12-08  9:26   ` Jarkko Sakkinen
2022-12-08  9:27     ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 09/18] x86/sgx: Return the number of EPC pages that were successfully reclaimed Kristen Carlson Accardi
2022-12-08  9:30   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 10/18] x86/sgx: Add option to ignore age of page during EPC reclaim Kristen Carlson Accardi
2022-12-08  9:37   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 11/18] x86/sgx: Prepare for multiple LRUs Kristen Carlson Accardi
2022-12-08  9:42   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 12/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Kristen Carlson Accardi
2022-12-08  9:46   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 13/18] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Kristen Carlson Accardi
2022-12-08  9:56   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 14/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Kristen Carlson Accardi
2022-12-08 15:21   ` Jarkko Sakkinen
2022-12-09 16:05     ` Kristen Carlson Accardi
2022-12-09 16:22       ` Dave Hansen
2022-12-12 18:09         ` Sean Christopherson
2022-12-26 20:43           ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 15/18] cgroup/misc: Add per resource callbacks for css events Kristen Carlson Accardi
2022-12-08 14:53   ` Jarkko Sakkinen
2022-12-08 15:15     ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 16/18] cgroup/misc: Prepare for SGX usage Kristen Carlson Accardi
2022-12-08 15:23   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 17/18] x86/sgx: Add support for misc cgroup controller Kristen Carlson Accardi
2022-12-08 15:30   ` Jarkko Sakkinen
2022-12-02 18:36 ` [PATCH v2 18/18] Docs/x86/sgx: Add description for cgroup support Kristen Carlson Accardi
2023-04-03 21:26 ` [EXTERNAL] [PATCH v2 00/18] Add Cgroup support for SGX EPC memory Anand Krishnamoorthi
2023-04-13 18:49   ` Anand Krishnamoorthi [this message]
2023-04-18 16:44     ` Mikko Ylinen
2023-04-27 16:53       ` Anand Krishnamoorthi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM6PR21MB11772A6ED915825854B419D6C4989@DM6PR21MB1177.namprd21.prod.outlook.com \
    --to=anakrish@microsoft.com \
    --cc=cgroups@vger.kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=jarkko@kernel.org \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=zhanb@microsoft.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).