From: Haitao Huang <haitao.huang@linux.intel.com>
To: jarkko@kernel.org, dave.hansen@linux.intel.com,
	kai.huang@intel.com, tj@kernel.org, mkoutny@suse.com,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	x86@kernel.org, cgroups@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	sohil.mehta@intel.com, tim.c.chen@linux.intel.com
Cc: zhiquan1.li@intel.com, kristen@linux.intel.com,
	seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com,
	mikko.ylinen@linux.intel.com, yangjie@microsoft.com,
	chrisyan@microsoft.com
Subject: [PATCH v11 13/14] Docs/x86/sgx: Add description for cgroup support
Date: Wed, 10 Apr 2024 11:25:57 -0700
Message-ID: <20240410182558.41467-14-haitao.huang@linux.intel.com>
In-Reply-To: <20240410182558.41467-1-haitao.huang@linux.intel.com>

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add initial documentation of how to regulate the distribution of
SGX Enclave Page Cache (EPC) memory via the Miscellaneous cgroup
controller.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
---
V8:
- Limit text width to 80 characters to be consistent.

V6:
- Remove mention of VMM-specific behavior on handling SIGBUS
- Remove statement of forced reclamation, add statement to specify
ENOMEM returned when no reclamation possible.
- Added statements on the non-preemptive nature for the max limit
- Dropped Reviewed-by tag because of changes

V4:
- Fix indentation (Randy)
- Change misc.events file to be read-only
- Fix a typo for 'subsystem'
- Add behavior when VMM overcommits EPC with a cgroup (Mikko)
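
Example (illustrative only, not part of the patch): a minimal Python sketch of
how user space might prepare a value for misc.max and read the flat-keyed
files described in this documentation. The helper names, the 4096-byte page
size default, and the "<key> <value>" line layout are assumptions made for the
sketch. The rounding mirrors the "rounded down to the closest PAGE_SIZE
multiple" rule stated in the documentation; the kernel does the rounding
itself, so the helper only previews the value that would take effect.

```python
def round_down_to_page(nbytes, page_size=4096):
    """Round a byte count down to a multiple of page_size.

    page_size defaults to 4096 as an assumption; use the platform's real
    PAGE_SIZE (e.g. os.sysconf("SC_PAGE_SIZE")) in practice.
    """
    return (nbytes // page_size) * page_size


def format_limit(nbytes=None, resource="sgx_epc", page_size=4096):
    """Build the "<key> <value>" line for misc.max; None means no limit."""
    if nbytes is None:
        return f"{resource} max"
    return f"{resource} {round_down_to_page(nbytes, page_size)}"


def parse_flat_keyed(text):
    """Parse flat-keyed file content into a dict.

    Numeric values become ints; the literal "max" is kept as a string.
    """
    values = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, value = line.split(None, 1)
        value = value.strip()
        values[key] = value if value == "max" else int(value)
    return values


# Hypothetical usage (the cgroup path is an assumption):
#   with open("/sys/fs/cgroup/enclaves/misc.max", "w") as f:
#       f.write(format_limit(10000))
#   with open("/sys/fs/cgroup/enclaves/misc.current") as f:
#       usage = parse_flat_keyed(f.read()).get("sgx_epc")
```

With the assumed 4096-byte page size, a requested limit of 10000 bytes is
previewed as 8192 bytes, i.e. two whole pages.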
---
 Documentation/arch/x86/sgx.rst | 83 ++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/Documentation/arch/x86/sgx.rst b/Documentation/arch/x86/sgx.rst
index d90796adc2ec..c537e6a9aa65 100644
--- a/Documentation/arch/x86/sgx.rst
+++ b/Documentation/arch/x86/sgx.rst
@@ -300,3 +300,86 @@ to expected failures and handle them as follows:
    first call.  It indicates a bug in the kernel or the userspace client
    if any of the second round of ``SGX_IOC_VEPC_REMOVE_ALL`` calls has
    a return code other than 0.
+
+
+Cgroup Support
+==============
+
+The "sgx_epc" resource within the Miscellaneous cgroup controller regulates
+distribution of SGX EPC memory, a subset of system RAM that is used to provide
+SGX-enabled applications with protected memory. EPC memory is otherwise
+inaccessible: it shows up as reserved in /proc/iomem and cannot be read or
+written outside of an SGX enclave.
+
+Although current systems implement EPC by stealing memory from RAM, for all
+intents and purposes the EPC is independent from normal system memory, e.g. it
+must be reserved from RAM at boot and cannot be converted between EPC and
+normal memory while the system is running.  The EPC is managed by the SGX
+subsystem and is not accounted by the memory controller.  Note that this is
+true only for EPC memory itself: normal memory allocations related to SGX and
+EPC memory, e.g. the backing memory for evicted EPC pages, are accounted,
+limited and protected by the memory controller.
+
+Much like normal system memory, EPC memory can be overcommitted via virtual
+memory techniques and pages can be swapped out of the EPC to their backing store
+(normal system memory allocated via shmem).  The SGX EPC subsystem is analogous
+to the memory subsystem, and it implements limit and protection models for EPC
+memory.
+
+SGX EPC Interface Files
+-----------------------
+
+For a generic description of the Miscellaneous controller interface files,
+please see Documentation/admin-guide/cgroup-v2.rst.
+
+All SGX EPC memory amounts are in bytes unless explicitly stated otherwise. If
+a value which is not PAGE_SIZE aligned is written, the actual value used by the
+controller will be rounded down to the closest PAGE_SIZE multiple.
+
+  misc.capacity
+        A read-only flat-keyed file shown only in the root cgroup. The sgx_epc
+        resource will show the total amount of EPC memory available on the
+        platform.
+
+  misc.current
+        A read-only flat-keyed file shown in the non-root cgroups. The sgx_epc
+        resource will show the current active EPC memory usage of the cgroup and
+        its descendants. EPC pages that are swapped out to backing RAM are not
+        included in the current count.
+
+  misc.max
+        A read-write flat-keyed file which exists on non-root cgroups. The
+        sgx_epc resource will show the EPC usage hard limit. The default is
+        "max".
+
+        If a cgroup's EPC usage reaches this limit, EPC allocations, e.g. for
+        page fault handling, will be blocked until EPC can be reclaimed from the
+        cgroup. If there are no pages left that are reclaimable within the same
+        cgroup, the kernel returns ENOMEM.
+
+        The EPC pages allocated for a guest VM by the virtual EPC driver are not
+        reclaimable by the host kernel. If the guest cgroup's limit is reached
+        and no reclaimable pages are left in the same cgroup, the virtual EPC
+        driver returns SIGBUS to the user space process to indicate failure on
+        new EPC allocation requests.
+
+        The misc.max limit is non-preemptive. If a user writes a limit lower
+        than the current usage to this file, the cgroup will not preemptively
+        deallocate pages currently in use. Instead, it blocks the next
+        allocation and starts reclaiming EPC at that time.
+
+  misc.events
+        A read-only flat-keyed file which exists on non-root cgroups.
+        A value change in this file generates a file modified event.
+
+          max
+                The number of times the cgroup has triggered a reclaim due to
+                its EPC usage approaching (or exceeding) its max EPC boundary.
+
+Migration
+---------
+
+Once an EPC page is charged to a cgroup (during allocation), it remains charged
+to the original cgroup until the page is released or reclaimed. Migrating a
+process to a different cgroup doesn't move the EPC charges that it incurred
+while in the previous cgroup to its new cgroup.
-- 
2.25.1

