From: "Haitao Huang" <haitao.huang@linux.intel.com>
To: dave.hansen@linux.intel.com, tj@kernel.org, mkoutny@suse.com,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	x86@kernel.org, cgroups@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	sohil.mehta@intel.com, "Jarkko Sakkinen" <jarkko@kernel.org>,
	"Haitao Huang" <haitao.huang@linux.intel.com>
Cc: zhiquan1.li@intel.com, kristen@linux.intel.com,
	seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com,
	mikko.ylinen@linux.intel.com, yangjie@microsoft.com
Subject: Re: [PATCH v6 00/12] Add Cgroup support for SGX EPC memory
Date: Tue, 07 Nov 2023 19:00:12 -0600
Message-ID: <op.2d2fqmpswjvjmi@hhuan26-mobl.amr.corp.intel.com>
In-Reply-To: <op.2dzvirvlwjvjmi@hhuan26-mobl.amr.corp.intel.com>

On Mon, 06 Nov 2023 09:48:36 -0600, Haitao Huang  
<haitao.huang@linux.intel.com> wrote:

> On Sun, 05 Nov 2023 21:26:44 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
> wrote:
>
>> On Mon, 2023-10-30 at 11:20 -0700, Haitao Huang wrote:
>>> SGX Enclave Page Cache (EPC) memory allocations are separate from  
>>> normal RAM allocations, and
>>> are managed solely by the SGX subsystem. The existing cgroup memory  
>>> controller cannot be used
>>> to limit or account for SGX EPC memory, which is a desirable feature  
>>> in some environments,
>>> e.g., support for pod level control in a Kubernetes cluster on a VM or
>>> bare-metal host [1,2].
>>>
>>> This patchset implements the support for sgx_epc memory within the
>>> misc cgroup controller. The
>>> user can use the misc cgroup controller to set and enforce a max limit  
>>> on total EPC usage per
>>> cgroup. The implementation reports current usage and events of  
>>> reaching the limit per cgroup as
>>> well as the total system capacity.
>>>
>>> With the EPC misc controller enabled, every EPC page allocation is
>>> accounted for a cgroup's
>>> usage, reflected in the 'sgx_epc' entry in the 'misc.current'  
>>> interface file of the cgroup.
>>> Much like normal system memory, EPC memory can be overcommitted via  
>>> virtual memory techniques
>>> and pages can be swapped out of the EPC to their backing store (normal  
>>> system memory allocated
>>> via shmem, accounted by the memory controller). When the EPC usage of  
>>> a cgroup reaches its hard
>>> limit ('sgx_epc' entry in the 'misc.max' file), the cgroup starts a  
>>> reclamation process to swap
>>> out some EPC pages within the same cgroup and its descendants to their
>>> backing store. Although
>>> the SGX architecture supports swapping for all pages, to avoid extra  
>>> complexities, this
>>> implementation does not support swapping for certain page types, e.g.,
>>> Version Array (VA) pages,
>>> and treats them as unreclaimable. When the limit is reached but
>>> nothing is left in the
>>> cgroup to reclaim, i.e., only unreclaimable pages remain, any new
>>> EPC allocation in the
>>> cgroup will result in an ENOMEM error.
>>>
>>> The EPC pages allocated for guest VMs by the virtual EPC driver are  
>>> not reclaimable by the host
>>> kernel [5]. Therefore, they are also treated as unreclaimable from
>>> the cgroup's point of view.
>>> The virtual EPC driver translates an ENOMEM error resulting from an EPC
>>> allocation request into
>>> a SIGBUS to the user process.
>>>
>>> This work was originally authored by Sean Christopherson a few years  
>>> ago, and previously
>>> modified by Kristen C. Accardi to utilize the misc cgroup controller  
>>> rather than a custom
>>> controller. I have been updating the patches based on review comments
>>> since V2 [3, 4, 10],
>>> simplifying the implementation/design and fixing some stability issues
>>> found in testing.
>>>
>>> The patches are organized as follows:
>>> - Patches 1-3 are prerequisite misc cgroup changes for adding new  
>>> APIs, structs, resource
>>>   types.
>>> - Patch 4 implements basic misc controller for EPC without reclamation.
>>> - Patches 5-9 prepare for per-cgroup reclamation.
>>>     * Separate out the existing infrastructure of tracking reclaimable  
>>> pages
>>>       from the global reclaimer(ksgxd) to a newly created LRU list  
>>> struct.
>>>     * Separate out reusable top-level functions for reclamation.
>>> - Patch 10 adds support for per-cgroup reclamation.
>>> - Patch 11 adds documentation for the EPC cgroup.
>>> - Patch 12 adds test scripts.
>>>
>>> I would appreciate your review and any tags, if appropriate.
>>>
>>> ---
>>> v6:
>>> - Dropped OOM killing path, only implement non-preemptive enforcement  
>>> of max limit (Dave, Michal)
>>> - Simplified reclamation flow by taking out sgx_epc_reclaim_control,
>>> forced reclamation by
>>>   ignoring 'age'.
>>> - Restructured patches: split misc API + resource types patch and the  
>>> big EPC cgroup patch
>>>   (Kai, Michal)
>>> - Dropped some Tested-by/Reviewed-by tags due to significant changes
>>> - Added more selftests
>>>
>>> v5:
>>> - Replace the manual test script with a selftest script.
>>> - Restore the "From" tag for some patches to Sean (Kai)
>>> - Style fixes (Jarkko)
>>>
>>> v4:
>>> - Collected "Tested-by" from Mikko. I kept it for now as no functional  
>>> changes in v4.
>>> - Rebased onto v6.6-rc1 and reordered patches as described above.
>>> - Separated out the bug fixes [7,8,9]. This series depends on those
>>> patches. (Dave, Jarkko)
>>> - Added comments in commit message to give more of a preview of what's
>>> to come next. (Jarkko)
>>> - Fixed some documentation error, gap, style (Mikko, Randy)
>>> - Fixed some comments, typo, style in code (Mikko, Kai)
>>> - Patch format and background for reclaimable vs unreclaimable (Kai,  
>>> Jarkko)
>>> - Fixed typo (Pavel)
>>> - Exclude the previous fixes/enhancements for self-tests. Patch 18 now  
>>> depends on series [6]
>>> - Use the same to list for cover and all patches. (Solo)
>>>
>>> v3:
>>> - Added EPC states to replace flags in sgx_epc_page struct. (Jarkko)
>>> - Unrolled wrappers for cond_resched, list (Dave)
>>> - Separate patches for adding reclaimable and unreclaimable lists.  
>>> (Dave)
>>> - Other improvements on patch flow, commit messages, styles. (Dave,  
>>> Jarkko)
>>> - Simplified the cgroup tree walking with plain
>>>   css_for_each_descendant_pre.
>>> - Fixed race conditions and crashes.
>>> - OOM killer to wait for the victim enclave pages being reclaimed.
>>> - Unblock the user by handling misc_max_write callback asynchronously.
>>> - Rebased onto 6.4 and no longer base this series on the MCA patchset.
>>> - Fix an overflow in misc_try_charge.
>>> - Fix a NULL pointer in SGX PF handler.
>>> - Updated and included the SGX selftest patches previously reviewed.  
>>> Those
>>>   patches fix issues triggered in high EPC pressure required for cgroup
>>>   testing.
>>> - Added test scripts to help setup and test SGX EPC cgroups.
>>>   
>>> [1]https://lore.kernel.org/all/DM6PR21MB11772A6ED915825854B419D6C4989@DM6PR21MB1177.namprd21.prod.outlook.com/
>>> [2]https://lore.kernel.org/all/ZD7Iutppjj+muH4p@himmelriiki/
>>> [3]https://lore.kernel.org/all/20221202183655.3767674-1-kristen@linux.intel.com/
>>> [4]https://lore.kernel.org/linux-sgx/20230712230202.47929-1-haitao.huang@linux.intel.com/
>>> [5]Documentation/arch/x86/sgx.rst, Section "Virtual EPC"
>>> [6]https://lore.kernel.org/linux-sgx/20220905020411.17290-1-jarkko@kernel.org/
>>> [7]https://lore.kernel.org/linux-sgx/ZLcXmvDKheCRYOjG@slm.duckdns.org/
>>> [8]https://lore.kernel.org/linux-sgx/20230721120231.13916-1-haitao.huang@linux.intel.com/
>>> [9]https://lore.kernel.org/linux-sgx/20230728051024.33063-1-haitao.huang@linux.intel.com/
>>> [10]https://lore.kernel.org/all/20230923030657.16148-1-haitao.huang@linux.intel.com/
>>>
>>> Haitao Huang (2):
>>>   x86/sgx: Introduce EPC page states
>>>   selftests/sgx: Add scripts for EPC cgroup testing
>>>
>>> Kristen Carlson Accardi (5):
>>>   cgroup/misc: Add per resource callbacks for CSS events
>>>   cgroup/misc: Export APIs for SGX driver
>>>   cgroup/misc: Add SGX EPC resource type
>>>   x86/sgx: Implement basic EPC misc cgroup functionality
>>>   x86/sgx: Implement EPC reclamation for cgroup
>>>
>>> Sean Christopherson (5):
>>>   x86/sgx: Add sgx_epc_lru_list to encapsulate LRU list
>>>   x86/sgx: Use sgx_epc_lru_list for existing active page list
>>>   x86/sgx: Use a list to track to-be-reclaimed pages
>>>   x86/sgx: Restructure top-level EPC reclaim function
>>>   Docs/x86/sgx: Add description for cgroup support
>>>
>>>  Documentation/arch/x86/sgx.rst                |  74 ++++
>>>  arch/x86/Kconfig                              |  13 +
>>>  arch/x86/kernel/cpu/sgx/Makefile              |   1 +
>>>  arch/x86/kernel/cpu/sgx/encl.c                |   2 +-
>>>  arch/x86/kernel/cpu/sgx/epc_cgroup.c          | 319 ++++++++++++++++++
>>>  arch/x86/kernel/cpu/sgx/epc_cgroup.h          |  49 +++
>>>  arch/x86/kernel/cpu/sgx/main.c                | 245 +++++++++-----
>>>  arch/x86/kernel/cpu/sgx/sgx.h                 |  88 ++++-
>>>  include/linux/misc_cgroup.h                   |  42 +++
>>>  kernel/cgroup/misc.c                          |  52 ++-
>>>  .../selftests/sgx/run_epc_cg_selftests.sh     | 196 +++++++++++
>>>  .../selftests/sgx/watch_misc_for_tests.sh     |  13 +
>>>  12 files changed, 996 insertions(+), 98 deletions(-)
>>>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c
>>>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h
>>>  create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh
>>>  create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh
>>>
>>
>> Is this expected to work on NUC7?
>>
>> Planning to test this next week (no time this week).
>>
>> BR, Jarkko
>
> I don't see a reason why it would not be working on a NUC. I'll try to  
> get access to one and test it too.

Tried on a NUC with about 90M EPC. The selftests worked fine.
BR
Haitao
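
For readers who want to check the accounting described in the quoted
cover letter: the following is a minimal sketch, not part of the
patchset or its selftests. It assumes the misc controller's usual
"name value" line format for misc.current and misc.max, with "max"
meaning no limit. The helper names parse_misc_file() and epc_headroom()
and the cgroup directory argument are hypothetical, chosen only for
illustration.

```python
from pathlib import Path


def parse_misc_file(text):
    """Parse 'name value' lines as found in misc.current or misc.max.

    Returns a dict mapping resource name to an int, or to None when
    the value is the literal string "max" (no limit set).
    """
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        entries[name] = None if value == "max" else int(value)
    return entries


def epc_headroom(cgroup_dir):
    """Return limit minus current usage for the 'sgx_epc' entry.

    Returns None when the entry is missing from either file or when
    the limit is "max". cgroup_dir is a path to a cgroup v2 directory
    (an assumption about where the caller's cgroup is mounted).
    """
    base = Path(cgroup_dir)
    current = parse_misc_file((base / "misc.current").read_text()).get("sgx_epc")
    limit = parse_misc_file((base / "misc.max").read_text()).get("sgx_epc")
    if current is None or limit is None:
        return None
    return limit - current
```

A headroom of zero would correspond to the point where the cover letter
says per-cgroup reclamation starts; whether a further allocation then
succeeds depends on whether any reclaimable pages are left in the cgroup.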

