All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Haitao Huang" <haitao.huang@linux.intel.com>
To: "mingo@redhat.com" <mingo@redhat.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"tj@kernel.org" <tj@kernel.org>,
	"Mehta, Sohil" <sohil.mehta@intel.com>,
	"Huang, Kai" <kai.huang@intel.com>
Cc: "kristen@linux.intel.com" <kristen@linux.intel.com>,
	"anakrish@microsoft.com" <anakrish@microsoft.com>,
	"Li, Zhiquan1" <zhiquan1.li@intel.com>, "Christopherson,,
	Sean" <seanjc@google.com>,
	"mikko.ylinen@linux.intel.com" <mikko.ylinen@linux.intel.com>,
	"yangjie@microsoft.com" <yangjie@microsoft.com>,
	"Zhang, Bo" <zhanb@microsoft.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
Date: Mon, 09 Oct 2023 20:38:12 -0500	[thread overview]
Message-ID: <op.2ckr5yetwjvjmi@hhuan26-mobl.amr.corp.intel.com> (raw)
In-Reply-To: <fc6aa778ddbde9536cafe48b847cf6c45b640ea4.camel@intel.com>

On Mon, 09 Oct 2023 20:18:00 -0500, Huang, Kai <kai.huang@intel.com> wrote:

> On Mon, 2023-10-09 at 20:04 -0500, Haitao Huang wrote:
>> On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai <kai.huang@intel.com>  
>> wrote:
>>
>> > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
>> > > From: Sean Christopherson <sean.j.christopherson@intel.com>
>> > >
>> > > Introduce the OOM path for killing an enclave with a reclaimer that  
>> is
>> > > no
>> > > longer able to reclaim enough EPC pages. Find a victim enclave,  
>> which
>> > > will be an enclave with only "unreclaimable" EPC pages left in the
>> > > cgroup LRU lists. Once a victim is identified, mark the enclave as  
>> OOM
>> > > and zap the enclave's entire page range, and drain all mm  
>> references in
>> > > encl->mm_list. Block allocating any EPC pages in #PF handler, or
>> > > reloading any pages in all paths, or creating any new mappings.
>> > >
>> > > The OOM killing path may race with the reclaimers: in some cases,  
>> the
>> > > victim enclave is in the process of reclaiming the last EPC pages  
>> when
>> > > OOM happens, that is, all pages other than SECS and VA pages are in
>> > > RECLAIMING_IN_PROGRESS state. The reclaiming process requires  
>> access to
>> > > the enclave backing, VA pages as well as SECS. So the OOM killer  
>> does
>> > > not directly release those enclave resources, instead, it lets all
>> > > reclaiming in progress to finish, and relies (as currently done) on
>> > > kref_put on encl->refcount to trigger sgx_encl_release() to do the
>> > > final cleanup.
>> > >
>> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> > > Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
>> > > Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
>> > > Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
>> > > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>> > > Cc: Sean Christopherson <seanjc@google.com>
>> > > ---
>> > > V5:
>> > > - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY
>> > >
>> > > V4:
>> > > - Updates for patch reordering and typo fixes.
>> > >
>> > > V3:
>> > > - Rebased to use the new VMA_ITERATOR to zap VMAs.
>> > > - Fixed the racing cases by blocking new page allocation/mapping and
>> > > reloading when enclave is marked for OOM. And do not release any  
>> enclave
>> > > resources other than draining mm_list entries, and let pages in
>> > > RECLAIMING_IN_PROGRESS to be reaped by reclaimers.
>> > > - Due to above changes, also removed the no-longer needed  
>> encl->lock in
>> > > the OOM path which was causing deadlocks reported by the lock  
>> prover.
>> > >
>> >
>> > [...]
>> >
>> > > +
>> > > +/**
>> > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU
>> > > + * @lru:	LRU that is low
>> > > + *
>> > > + * Return:	%true if a victim was found and kicked.
>> > > + */
>> > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru)
>> > > +{
>> > > +	struct sgx_epc_page *victim;
>> > > +
>> > > +	spin_lock(&lru->lock);
>> > > +	victim = sgx_oom_get_victim(lru);
>> > > +	spin_unlock(&lru->lock);
>> > > +
>> > > +	if (!victim)
>> > > +		return false;
>> > > +
>> > > +	if (victim->flags & SGX_EPC_OWNER_PAGE)
>> > > +		return sgx_oom_encl_page(victim->encl_page);
>> > > +
>> > > +	if (victim->flags & SGX_EPC_OWNER_ENCL)
>> > > +		return sgx_oom_encl(victim->encl);
>> >
>> > I hate to bring this up, at least at this stage, but I am wondering  
>> why
>> > we need
>> > to put VA and SECS pages to the unreclaimable list, but cannot keep an
>> > "enclave_list" instead?
>> >
>> > So by looking the patch (" x86/sgx: Limit process EPC usage with misc
>> > cgroup
>> > controller"), if I am not missing anything, the whole "unreclaimable"
>> > list is
>> > just used to find the victim enclave when OOM needs to be done.   
>> Thus, I
>> > don't
>> > see why "enclave_list" cannot be used to achieve this.
>> >
>> > The reason that I am asking is because it seems using "enclave_list"  
>> we
>> > can
>> > simplify the code.  At least the patches related to track VA/SECS  
>> pages,
>> > and the
>> > SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated
>> > completely.
>> > Using "enclave_list", I guess you just need to put the enclave to the
>> > current
>> > EPC cgroup when SECS page is allocated.
>> >
>> Later the hosting process could migrated/reassigned to another cgroup?
>> What to do when the new cgroup is OOM?
>>
>
> You addressed in the documentation, no?
>
> +Migration
> +---------
> +
> +Once an EPC page is charged to a cgroup (during allocation), it
> +remains charged to the original cgroup until the page is released
> +or reclaimed.  Migrating a process to a different cgroup doesn't
> +move the EPC charges that it incurred while in the previous cgroup
> +to its new cgroup.

Should we kill the enclave though because some VA pages may be in the new  
group?

Haitao

  reply	other threads:[~2023-10-10  1:38 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-23  3:06 [PATCH v5 00/18] Add Cgroup support for SGX EPC memory Haitao Huang
2023-09-23  3:06 ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 01/18] cgroup/misc: Add per resource callbacks for CSS events Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:09   ` Jarkko Sakkinen
2023-09-25 17:09     ` Jarkko Sakkinen
2023-09-26  3:04     ` Haitao Huang
2023-09-26 13:10       ` Jarkko Sakkinen
2023-09-26 13:10         ` Jarkko Sakkinen
2023-09-26 13:13         ` Jarkko Sakkinen
2023-09-26 13:13           ` Jarkko Sakkinen
2023-09-27  1:56           ` Haitao Huang
2023-10-02 22:47             ` Jarkko Sakkinen
2023-10-02 22:55               ` Jarkko Sakkinen
2023-10-04 15:45                 ` Haitao Huang
2023-10-04 17:18                   ` Tejun Heo
2023-09-27  9:20   ` Huang, Kai
2023-10-03 14:29     ` Haitao Huang
2023-10-17 18:55   ` Michal Koutný
2023-09-23  3:06 ` [PATCH v5 02/18] cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 18:50   ` Tejun Heo
2023-09-25 18:50     ` Tejun Heo
2023-09-28  3:59   ` Huang, Kai
2023-10-03  7:00     ` Haitao Huang
2023-10-03 19:33       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 03/18] x86/sgx: Add sgx_epc_lru_lists to encapsulate LRU lists Haitao Huang
2023-09-23  3:06 ` [PATCH v5 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 05/18] x86/sgx: Store reclaimable EPC pages in sgx_epc_lru_lists Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 10:14   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 06/18] x86/sgx: Introduce EPC page states Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:11   ` Jarkko Sakkinen
2023-09-25 17:11     ` Jarkko Sakkinen
2023-09-27 10:28   ` Huang, Kai
2023-10-03  4:49     ` Haitao Huang
2023-10-03 20:03       ` Huang, Kai
2023-10-04 15:24         ` Haitao Huang
2023-10-04 21:05           ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 07/18] x86/sgx: Introduce RECLAIM_IN_PROGRESS state Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:13   ` Jarkko Sakkinen
2023-09-25 17:13     ` Jarkko Sakkinen
2023-09-27 10:42   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 08/18] x86/sgx: Use a list to track to-be-reclaimed pages Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-28  9:28   ` Huang, Kai
2023-10-03  5:09     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 11:14   ` Huang, Kai
2023-09-27 15:35     ` Haitao Huang
2023-09-27 21:21       ` Huang, Kai
2023-09-29 15:06         ` Haitao Huang
2023-10-02 11:05           ` Huang, Kai
2023-09-27 11:35   ` Huang, Kai
2023-10-03  6:45     ` Haitao Huang
2023-10-03 20:07       ` Huang, Kai
2023-10-04 15:03         ` Haitao Huang
2023-10-04 21:13           ` Huang, Kai
2023-10-05  4:22             ` Haitao Huang
2023-10-05  6:49               ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 10/18] x86/sgx: Add EPC page flags to identify owner types Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 11/18] x86/sgx: store unreclaimable pages in LRU lists Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 11:57   ` Huang, Kai
2023-10-03  5:42     ` Haitao Huang
2023-09-28  9:41   ` Huang, Kai
2023-10-03  5:15     ` Haitao Huang
2023-10-03 20:12       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-09 23:45   ` Huang, Kai
2023-10-10  0:23     ` Sean Christopherson
2023-10-10  0:50       ` Huang, Kai
2023-10-10  1:34         ` Huang, Kai
2023-10-10 16:49           ` Haitao Huang
2023-10-11  0:51             ` Huang, Kai
2023-10-12 13:27               ` Haitao Huang
2023-10-16 10:57                 ` Huang, Kai
2023-10-16 19:52                   ` Haitao Huang
2023-10-16 21:09                     ` Huang, Kai
2023-10-17  0:10                       ` Haitao Huang
2023-10-17  1:34                         ` Huang, Kai
2023-10-17 12:58                           ` Haitao Huang
2023-10-17 18:54                             ` Michal Koutný
2023-10-17 19:13                               ` Michal Koutný
2023-10-18  4:39                                 ` Haitao Huang
2023-10-18  4:37                               ` Haitao Huang
2023-10-18 13:55                                 ` Dave Hansen
2023-10-18 15:26                                   ` Haitao Huang
2023-10-18 15:37                                     ` Dave Hansen
2023-10-18 15:52                                       ` Michal Koutný
2023-10-18 16:25                                         ` Haitao Huang
2023-10-16 21:32                     ` Sean Christopherson
2023-10-17  0:09                       ` Haitao Huang
2023-10-17 15:43                         ` Sean Christopherson
2023-10-17 11:49                       ` Mikko Ylinen
2023-10-11  1:14             ` Huang, Kai
2023-10-16 11:02               ` Huang, Kai
2023-10-10  1:42       ` Haitao Huang
2023-10-10  2:23         ` Huang, Kai
2023-10-10 13:26           ` Haitao Huang
2023-10-11  0:01             ` Sean Christopherson
2023-10-11 15:02               ` Haitao Huang
2023-10-10  1:04     ` Haitao Huang
2023-10-10  1:18       ` Huang, Kai
2023-10-10  1:38         ` Haitao Huang [this message]
2023-10-10  2:12           ` Huang, Kai
2023-10-10 17:05             ` Haitao Huang
2023-10-11  0:31               ` Huang, Kai
2023-10-11 16:04                 ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-05 12:24   ` Huang, Kai
2023-10-05 19:23     ` Haitao Huang
2023-10-05 20:25       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 14/18] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-05 12:30   ` Huang, Kai
2023-10-05 19:33     ` Haitao Huang
2023-10-05 20:38       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:15   ` Jarkko Sakkinen
2023-09-25 17:15     ` Jarkko Sakkinen
2023-10-05 21:01   ` Huang, Kai
2023-10-10  0:12   ` Huang, Kai
2023-10-10  0:16   ` Huang, Kai
2023-10-10  0:26   ` Huang, Kai
2023-10-22 18:26     ` Haitao Huang
2023-10-10  9:19   ` Huang, Kai
2023-10-10  9:32   ` Huang, Kai
2023-10-17 18:54   ` Michal Koutný
2023-10-19 16:05     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 17/18] Docs/x86/sgx: Add description for cgroup support Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 18/18] selftests/sgx: Add scripts for EPC cgroup testing Haitao Huang
2023-09-23  3:06   ` Haitao Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=op.2ckr5yetwjvjmi@hhuan26-mobl.amr.corp.intel.com \
    --to=haitao.huang@linux.intel.com \
    --cc=anakrish@microsoft.com \
    --cc=bp@alien8.de \
    --cc=cgroups@vger.kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=kai.huang@intel.com \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mikko.ylinen@linux.intel.com \
    --cc=mingo@redhat.com \
    --cc=seanjc@google.com \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    --cc=yangjie@microsoft.com \
    --cc=zhanb@microsoft.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.