From: Haitao Huang <haitao.huang@linux.intel.com>
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
	mkoutny@suse.com, linux-kernel@vger.kernel.org,
	linux-sgx@vger.kernel.org, x86@kernel.org,
	cgroups@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, hpa@zytor.com, sohil.mehta@intel.com
Cc: zhiquan1.li@intel.com, kristen@linux.intel.com,
	seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com,
	mikko.ylinen@linux.intel.com, yangjie@microsoft.com
Subject: [PATCH v7 14/15] Docs/x86/sgx: Add description for cgroup support
Date: Mon, 22 Jan 2024 09:20:47 -0800
Message-ID: <20240122172048.11953-15-haitao.huang@linux.intel.com>
In-Reply-To: <20240122172048.11953-1-haitao.huang@linux.intel.com>

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add initial documentation of how to regulate the distribution of
SGX Enclave Page Cache (EPC) memory via the Miscellaneous cgroup
controller.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
---
V6:
- Remove mention of VMM-specific behavior on handling SIGBUS
- Remove statement on forced reclamation; add statement specifying that
ENOMEM is returned when no reclamation is possible.
- Add statements on the non-preemptive nature of the max limit
- Dropped Reviewed-by tag because of changes

V4:
- Fix indentation (Randy)
- Change misc.events file to be read-only
- Fix a typo for 'subsystem'
- Add behavior when a VMM overcommits EPC with a cgroup (Mikko)
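
Not part of the patch: a minimal Python sketch of two rules from the
documentation below, namely that a value which is not PAGE_SIZE aligned is
rounded down to a PAGE_SIZE multiple, and that flat-keyed files such as
misc.current and misc.capacity carry one "<resource> <value>" pair per line.
The helper names and the 4096-byte page size are assumptions made for
illustration; they are not taken from the kernel sources. The sketch also
assumes that misc.max accepts the usual "<resource> <value>" form described in
Documentation/admin-guide/cgroup-v2.rst.

```python
# Illustrative only: helper names and PAGE_SIZE are assumptions, not kernel API.
PAGE_SIZE = 4096  # assumed page size; os.sysconf("SC_PAGE_SIZE") gives the real one


def round_down_to_page(value: int, page_size: int = PAGE_SIZE) -> int:
    """Round a byte count down to a multiple of page_size."""
    if value < 0:
        raise ValueError("EPC amounts cannot be negative")
    return value - (value % page_size)


def parse_flat_keyed(text: str) -> dict:
    """Parse flat-keyed cgroup file content such as 'sgx_epc 8192' into a dict.

    Numeric values become int; the literal 'max' (no limit) stays a string.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        result[key] = value if value == "max" else int(value)
    return result


def format_limit(resource: str, limit_bytes: int) -> str:
    """Build the line a user might write to misc.max for the given resource."""
    return f"{resource} {round_down_to_page(limit_bytes)}"
```

For example, format_limit("sgx_epc", 10000) yields "sgx_epc 8192", the value
the controller would be expected to use after rounding down.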
---
 Documentation/arch/x86/sgx.rst | 74 ++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/Documentation/arch/x86/sgx.rst b/Documentation/arch/x86/sgx.rst
index d90796adc2ec..dfc8fac13ab2 100644
--- a/Documentation/arch/x86/sgx.rst
+++ b/Documentation/arch/x86/sgx.rst
@@ -300,3 +300,77 @@ to expected failures and handle them as follows:
    first call.  It indicates a bug in the kernel or the userspace client
    if any of the second round of ``SGX_IOC_VEPC_REMOVE_ALL`` calls has
    a return code other than 0.
+
+
+Cgroup Support
+==============
+
+The "sgx_epc" resource within the Miscellaneous cgroup controller regulates the distribution of
+SGX EPC memory.  EPC is a subset of system RAM that provides SGX-enabled applications with
+protected memory and is otherwise inaccessible, i.e. it shows up as reserved in /proc/iomem and
+cannot be read or written from outside an SGX enclave.
+
+Although current systems implement EPC by carving memory out of RAM, for all intents and
+purposes the EPC is independent of normal system memory, e.g. it must be reserved from RAM at
+boot and cannot be converted between EPC and normal memory while the system is running.  The EPC
+is managed by the SGX subsystem and is not accounted by the memory controller.  Note that this
+applies only to EPC memory itself: normal memory allocations related to SGX and EPC memory,
+e.g. the backing memory for evicted EPC pages, are still accounted, limited and protected by
+the memory controller.
+
+Much like normal system memory, EPC memory can be overcommitted via virtual memory techniques
+and pages can be swapped out of the EPC to their backing store (normal system memory allocated
+via shmem).  The SGX EPC subsystem is analogous to the memory subsystem, and it implements
+limit and protection models for EPC memory.
+
+SGX EPC Interface Files
+-----------------------
+
+For a generic description of the Miscellaneous controller interface files, please see
+Documentation/admin-guide/cgroup-v2.rst.
+
+All SGX EPC memory amounts are in bytes unless explicitly stated otherwise.  If a value that is
+not PAGE_SIZE aligned is written, the controller rounds the value it actually uses down to the
+nearest PAGE_SIZE multiple.
+
+  misc.capacity
+        A read-only flat-keyed file shown only in the root cgroup.  The sgx_epc resource will
+        show the total amount of EPC memory available on the platform.
+
+  misc.current
+        A read-only flat-keyed file shown in the non-root cgroups.  The sgx_epc resource will
+        show the current active EPC memory usage of the cgroup and its descendants. EPC pages
+        that are swapped out to backing RAM are not included in the current count.
+
+  misc.max
+        A read-write flat-keyed file which exists on non-root cgroups. The sgx_epc resource
+        will show the EPC usage hard limit. The default is "max".
+
+        If a cgroup's EPC usage reaches this limit, EPC allocations, e.g. for page fault
+        handling, will be blocked until EPC can be reclaimed from the cgroup. If no reclaimable
+        pages are left within the same cgroup, the kernel returns ENOMEM.
+
+        The EPC pages allocated for a guest VM by the virtual EPC driver are not reclaimable by
+        the host kernel. If the guest cgroup's limit is reached and no reclaimable pages are
+        left in the same cgroup, the virtual EPC driver sends SIGBUS to the user space
+        process to indicate that a new EPC allocation request failed.
+
+        The misc.max limit is non-preemptive. If a user writes a limit lower than the current
+        usage to this file, the cgroup does not preemptively deallocate pages currently in use.
+        Instead, it blocks the next allocation and starts reclaiming EPC at that time.
+
+  misc.events
+        A read-only flat-keyed file which exists on non-root cgroups.
+        A value change in this file generates a file modified event.
+
+          max
+                The number of times the cgroup has triggered a reclaim
+                due to its EPC usage approaching (or exceeding) its max
+                EPC boundary.
+
+Migration
+---------
+
+Once an EPC page is charged to a cgroup (during allocation), it remains charged to the original
+cgroup until the page is released or reclaimed.  Migrating a process to a different cgroup
+doesn't move the EPC charges that it incurred while in the previous cgroup to its new cgroup.
-- 
2.25.1

