linux-sgx.vger.kernel.org archive mirror
From: "Huang, Kai" <kai.huang@intel.com>
To: Haitao Huang <haitao.huang@linux.intel.com>,
	"tj@kernel.org" <tj@kernel.org>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tim.c.chen@linux.intel.com" <tim.c.chen@linux.intel.com>,
	"mkoutny@suse.com" <mkoutny@suse.com>,
	"Mehta, Sohil" <sohil.mehta@intel.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"bp@alien8.de" <bp@alien8.de>
Cc: "mikko.ylinen@linux.intel.com" <mikko.ylinen@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"anakrish@microsoft.com" <anakrish@microsoft.com>,
	"Zhang, Bo" <zhanb@microsoft.com>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>,
	"yangjie@microsoft.com" <yangjie@microsoft.com>,
	"Li, Zhiquan1" <zhiquan1.li@intel.com>,
	"chrisyan@microsoft.com" <chrisyan@microsoft.com>
Subject: Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup try_charge()
Date: Tue, 27 Feb 2024 11:24:28 +1300
Message-ID: <010f5c8e-6e63-4b37-82d7-ba755f989755@intel.com>
In-Reply-To: <op.2jrpgcufwjvjmi@hhuan26-mobl.amr.corp.intel.com>



On 27/02/2024 10:18 am, Haitao Huang wrote:
> On Mon, 26 Feb 2024 05:36:02 -0600, Huang, Kai <kai.huang@intel.com> wrote:
> 
>> On Sun, 2024-02-25 at 22:03 -0600, Haitao Huang wrote:
>>> On Sun, 25 Feb 2024 19:38:26 -0600, Huang, Kai <kai.huang@intel.com> 
>>> wrote:
>>>
>>> >
>>> >
>>> > On 24/02/2024 6:00 am, Haitao Huang wrote:
>>> > > On Fri, 23 Feb 2024 04:18:18 -0600, Huang, Kai <kai.huang@intel.com>
>>> > > wrote:
>>> > >
>>> > > > > >
>>> > > > > Right. When code reaches to here, we already passed reclaim per
>>> > > > > cgroup.
>>> > > >
>>> > > > Yes if try_charge() failed we must do per-cgroup reclaim.
>>> > > >
>>> > > > > The cgroup may not be at its limit, but the system has run out
>>> > > > > of physical EPC.
>>> > > > >
>>> > > >
>>> > > > But after try_charge() we can still choose to reclaim from the
>>> > > > current group, but not necessarily have to be global, right?  I am
>>> > > > not sure whether I am missing something, but could you elaborate
>>> > > > why we should choose to reclaim from the global?
>>> > > >
>>> > > Once try_charge is done and returns zero that means the cgroup
>>> > > usage is charged and it's not over usage limit. So you really can't
>>> > > reclaim from that cgroup if allocation failed. The only thing you
>>> > > can do is to reclaim globally.
>>> >
>>> > Sorry I still cannot establish the logic here.
>>> >
>>> > Let's say the sum of all cgroups are greater than the physical EPC,
>>> > and enclave(s) in each cgroup could potentially fault w/o reaching
>>> > cgroup's limit.
>>> >
>>> > In this case, when enclave(s) in one cgroup faults, why we cannot
>>> > reclaim from the current cgroup, but have to reclaim from global?
>>> >
>>> > Is there any real downside of the former, or you just want to follow
>>> > the reclaim logic w/o cgroup at all?
>>> >
>>> > IIUC, there's at least one advantage of reclaim from the current
>>> > group, that faults of enclave(s) in one group won't impact other
>>> > enclaves in other cgroups.  E.g., in this way other enclaves in other
>>> > groups may never need to trigger faults.
>>> >
>>> > Or perhaps I am missing anything?
>>> >
>>> The use case here is that user knows it's OK for group A to borrow some
>>> pages from group B for some time without impacting performance much, and
>>> vice versa. That's why the user is overcommitting so system can run more
>>> enclaves/groups. Otherwise, if she is concerned about impact of A on B,
>>> she could lower limit for A so it never interferes or interferes less
>>> with B (assume the lower limit is still high enough to run all enclaves
>>> in A), and sacrifice some of A's performance. Or if she does not want any
>>> interference between groups, just don't over-commit. So we don't really
>>> lose anything here.
>>
>> But if we reclaim from the same group, seems we could enable a use case
>> that allows the admin to ensure certain group won't be impacted at all,
>> while allowing other groups to over-commit?
>>
>> E.g., let's say we have 100M physical EPC.  And let's say the admin wants
>> to run some performance-critical enclave(s) which costs 50M EPC w/o being
>> impacted.  The admin also wants to run other enclaves which could cost
>> 100M EPC in total but EPC swapping among them is acceptable.
>>
>> If we choose to reclaim from the current EPC cgroup, then it seems that
>> the admin can achieve the above by setting up 2 groups with group1 having
>> 50M limit and group2 having 100M limit, and then run performance-critical
>> enclave(s) in group1 and others in group2?  Or am I missing anything?
>>
> 
> The more important groups should have limits higher than or equal to 
> peak usage to ensure no impact.

Yes.  But if you do global reclaim there's no guarantee of this
regardless of that group's own limit.  It depends on the limits set for
the other groups.

> The less important groups should have limits lower than their peak usage
> to avoid impacting higher priority groups.

Yeah, but depending on how low the limit is, try_charge() can still
succeed even though physical EPC has already run out.
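
To make that concrete, here is a toy Python model of the case, using the
100M/50M/100M numbers from earlier in this thread.  It is only a sketch
and not kernel code: Cgroup, EpcPool, try_charge() and alloc() here are
made-up stand-ins that loosely mirror the charge-then-allocate order, and
the assumption that group2 comes up first is mine.

```python
class Cgroup:
    """Toy EPC cgroup: tracks charged usage against a limit (all in MB)."""

    def __init__(self, limit_mb):
        self.limit_mb = limit_mb
        self.usage_mb = 0

    def try_charge(self, mb):
        # Hypothetical stand-in for the charge step: it only looks at this
        # group's own limit, never at how much physical EPC is left.
        if self.usage_mb + mb > self.limit_mb:
            return False
        self.usage_mb += mb
        return True


class EpcPool:
    """Toy physical EPC pool shared by all groups."""

    def __init__(self, total_mb):
        self.free_mb = total_mb

    def alloc(self, mb):
        # Fails when physical EPC is exhausted, whatever the limits say.
        if mb > self.free_mb:
            return False
        self.free_mb -= mb
        return True


pool = EpcPool(100)            # 100M physical EPC
group1 = Cgroup(limit_mb=50)   # performance-critical group
group2 = Cgroup(limit_mb=100)  # group allowed to swap

# Assumed ordering: group2 comes up first and fills all of physical EPC,
# staying within its own limit.
assert group2.try_charge(100) and pool.alloc(100)

# group1 is far below its limit, so the charge passes, yet no page is free.
charge_ok = group1.try_charge(1)
alloc_ok = pool.alloc(1)
print(charge_ok, alloc_ok)
```

The charge check and the physical allocation are independent tests, so
once the limits add up to more than physical EPC the first can pass while
the second fails.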

Are you saying we should always expect the admin to set the group limits
so that their sum does not exceed the physical EPC?

> The limit is the maximum usage allowed.
> 
> By setting group2 limit to 100M, you are allowing it to use 100M. So as
> soon as it gets up and consumes 100M, group1 can not even load any
> enclave if we only reclaim per-cgroup and do not do global reclaim.

I kinda forgot, but I think SGX supports swapping out EPC of an enclave 
before EINIT?  Also, with SGX2 the initial enclave can take less EPC to 
be loaded.

> 
>> If we choose to do global reclaim, then we cannot achieve that.
> 
> 
> You can achieve this by setting group 2 limit to 50M. No need to
> overcommit the system.
> Group 2 will swap as soon as it hits 50M, which is the maximum it can
> consume so no impact to group 1.

Right.  We can achieve this by doing so.  But as said above, you are
then depending on the limit settings to get per-cgroup reclaim.

So, back to the question:

What is the downside of doing per-group reclaim when try_charge()
succeeds for the enclave but the EPC page allocation fails?

Could you give a complete answer on why you chose to use global reclaim
for the above case?
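
For reference, the two policies being compared can be put side by side in
a small Python sketch.  This is a toy model and not what the patch
implements: reclaim() is a hypothetical helper, the LRU is a plain list of
1M chunks tagged with the owning group (oldest first), and the LRU
orderings in both scenarios are assumptions picked to show each side's
argument.

```python
def reclaim(lru, requester, need_mb, policy):
    """Evict up to need_mb chunks, oldest first; return evictions per group.

    policy "global": any group's chunk may be evicted.
    policy "cgroup": only the requester's own chunks may be evicted.
    """
    evicted = {}
    kept = []
    for owner in lru:
        eligible = policy == "global" or owner == requester
        if need_mb > 0 and eligible:
            evicted[owner] = evicted.get(owner, 0) + 1
            need_mb -= 1
        else:
            kept.append(owner)
    lru[:] = kept  # drop the evicted chunks from the caller's LRU
    return evicted


def scenario_a():
    # group1 holds 50M (assumed oldest), group2 holds 50M: EPC is full.
    return ["group1"] * 50 + ["group2"] * 50


def scenario_b():
    # group2 already holds all 100M; group1 holds nothing yet.
    return ["group2"] * 100


# Scenario A: group2 (limit 100M) is charged for 10M more, allocation fails.
a_global = reclaim(scenario_a(), "group2", 10, "global")
a_cgroup = reclaim(scenario_a(), "group2", 10, "cgroup")

# Scenario B: group1 (limit 50M) is charged for 10M, allocation fails.
b_global = reclaim(scenario_b(), "group1", 10, "global")
b_cgroup = reclaim(scenario_b(), "group1", 10, "cgroup")

print(a_global, a_cgroup, b_global, b_cgroup)
```

In scenario A global reclaim takes pages from group1 even though group1
did nothing, while per-cgroup reclaim keeps the cost inside group2.  In
scenario B per-cgroup reclaim finds nothing to evict for group1, which is
the "group1 can not even load any enclave" case quoted above.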
