linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Haitao Huang" <haitao.huang@linux.intel.com>
To: "hpa@zytor.com" <hpa@zytor.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"Mehta, Sohil" <sohil.mehta@intel.com>,
	"tj@kernel.org" <tj@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"Huang, Kai" <kai.huang@intel.com>
Cc: "kristen@linux.intel.com" <kristen@linux.intel.com>,
	"yangjie@microsoft.com" <yangjie@microsoft.com>,
	"Li, Zhiquan1" <zhiquan1.li@intel.com>, "Christopherson,,
	Sean" <seanjc@google.com>,
	"mikko.ylinen@linux.intel.com" <mikko.ylinen@linux.intel.com>,
	"Zhang, Bo" <zhanb@microsoft.com>,
	"anakrish@microsoft.com" <anakrish@microsoft.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
Date: Mon, 09 Oct 2023 20:04:54 -0500	[thread overview]
Message-ID: <op.2ckqmgs9wjvjmi@hhuan26-mobl.amr.corp.intel.com> (raw)
In-Reply-To: <1b265d0c9dfe17de2782962ed26a99cc9d330138.camel@intel.com>

On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai <kai.huang@intel.com> wrote:

> On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>>
>> Introduce the OOM path for killing an enclave with a reclaimer that is  
>> no
>> longer able to reclaim enough EPC pages. Find a victim enclave, which
>> will be an enclave with only "unreclaimable" EPC pages left in the
>> cgroup LRU lists. Once a victim is identified, mark the enclave as OOM
>> and zap the enclave's entire page range, and drain all mm references in
>> encl->mm_list. Block allocating any EPC pages in #PF handler, or
>> reloading any pages in all paths, or creating any new mappings.
>>
>> The OOM killing path may race with the reclaimers: in some cases, the
>> victim enclave is in the process of reclaiming the last EPC pages when
>> OOM happens, that is, all pages other than SECS and VA pages are in
>> RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to
>> the enclave backing, VA pages as well as SECS. So the OOM killer does
>> not directly release those enclave resources, instead, it lets all
>> reclaiming in progress to finish, and relies (as currently done) on
>> kref_put on encl->refcount to trigger sgx_encl_release() to do the
>> final cleanup.
>>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
>> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
>> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>> Cc: Sean Christopherson <seanjc@google.com>
>> ---
>> V5:
>> - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY
>>
>> V4:
>> - Updates for patch reordering and typo fixes.
>>
>> V3:
>> - Rebased to use the new VMA_ITERATOR to zap VMAs.
>> - Fixed the racing cases by blocking new page allocation/mapping and
>> reloading when enclave is marked for OOM. And do not release any enclave
>> resources other than draining mm_list entries, and let pages in
>> RECLAIMING_IN_PROGRESS to be reaped by reclaimers.
>> - Due to above changes, also removed the no-longer needed encl->lock in
>> the OOM path which was causing deadlocks reported by the lock prover.
>>
>
> [...]
>
>> +
>> +/**
>> + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU
>> + * @lru:	LRU that is low
>> + *
>> + * Return:	%true if a victim was found and kicked.
>> + */
>> +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru)
>> +{
>> +	struct sgx_epc_page *victim;
>> +
>> +	spin_lock(&lru->lock);
>> +	victim = sgx_oom_get_victim(lru);
>> +	spin_unlock(&lru->lock);
>> +
>> +	if (!victim)
>> +		return false;
>> +
>> +	if (victim->flags & SGX_EPC_OWNER_PAGE)
>> +		return sgx_oom_encl_page(victim->encl_page);
>> +
>> +	if (victim->flags & SGX_EPC_OWNER_ENCL)
>> +		return sgx_oom_encl(victim->encl);
>
> I hate to bring this up, at least at this stage, but I am wondering why  
> we need
> to put VA and SECS pages to the unreclaimable list, but cannot keep an
> "enclave_list" instead?
>
> So by looking the patch (" x86/sgx: Limit process EPC usage with misc  
> cgroup
> controller"), if I am not missing anything, the whole "unreclaimable"  
> list is
> just used to find the victim enclave when OOM needs to be done.  Thus, I  
> don't
> see why "enclave_list" cannot be used to achieve this.
>
> The reason that I am asking is because it seems using "enclave_list" we  
> can
> simplify the code.  At least the patches related to track VA/SECS pages,  
> and the
> SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated  
> completely.  
> Using "enclave_list", I guess you just need to put the enclave to the  
> current
> EPC cgroup when SECS page is allocated.
>
Later the hosting process could migrated/reassigned to another cgroup?
What to do when the new cgroup is OOM?

Thanks
Haitao

  parent reply	other threads:[~2023-10-10  1:06 UTC|newest]

Thread overview: 119+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-23  3:06 [PATCH v5 00/18] Add Cgroup support for SGX EPC memory Haitao Huang
2023-09-23  3:06 ` [PATCH v5 01/18] cgroup/misc: Add per resource callbacks for CSS events Haitao Huang
2023-09-25 17:09   ` Jarkko Sakkinen
2023-09-26  3:04     ` Haitao Huang
2023-09-26 13:10       ` Jarkko Sakkinen
2023-09-26 13:13         ` Jarkko Sakkinen
2023-09-27  1:56           ` Haitao Huang
2023-10-02 22:47             ` Jarkko Sakkinen
2023-10-02 22:55               ` Jarkko Sakkinen
2023-10-04 15:45                 ` Haitao Huang
2023-10-04 17:18                   ` Tejun Heo
2023-09-27  9:20   ` Huang, Kai
2023-10-03 14:29     ` Haitao Huang
2023-10-17 18:55   ` Michal Koutný
2023-09-23  3:06 ` [PATCH v5 02/18] cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver Haitao Huang
2023-09-25 18:50   ` Tejun Heo
2023-09-28  3:59   ` Huang, Kai
2023-10-03  7:00     ` Haitao Huang
2023-10-03 19:33       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 03/18] x86/sgx: Add sgx_epc_lru_lists to encapsulate LRU lists Haitao Huang
2023-09-23  3:06 ` [PATCH v5 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list Haitao Huang
2023-09-23  3:06 ` [PATCH v5 05/18] x86/sgx: Store reclaimable EPC pages in sgx_epc_lru_lists Haitao Huang
2023-09-27 10:14   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 06/18] x86/sgx: Introduce EPC page states Haitao Huang
2023-09-25 17:11   ` Jarkko Sakkinen
2023-09-27 10:28   ` Huang, Kai
2023-10-03  4:49     ` Haitao Huang
2023-10-03 20:03       ` Huang, Kai
2023-10-04 15:24         ` Haitao Huang
2023-10-04 21:05           ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 07/18] x86/sgx: Introduce RECLAIM_IN_PROGRESS state Haitao Huang
2023-09-25 17:13   ` Jarkko Sakkinen
2023-09-27 10:42   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 08/18] x86/sgx: Use a list to track to-be-reclaimed pages Haitao Huang
2023-09-28  9:28   ` Huang, Kai
2023-10-03  5:09     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages Haitao Huang
2023-09-27 11:14   ` Huang, Kai
2023-09-27 15:35     ` Haitao Huang
2023-09-27 21:21       ` Huang, Kai
2023-09-29 15:06         ` Haitao Huang
2023-10-02 11:05           ` Huang, Kai
2023-09-27 11:35   ` Huang, Kai
2023-10-03  6:45     ` Haitao Huang
2023-10-03 20:07       ` Huang, Kai
2023-10-04 15:03         ` Haitao Huang
2023-10-04 21:13           ` Huang, Kai
2023-10-05  4:22             ` Haitao Huang
2023-10-05  6:49               ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 10/18] x86/sgx: Add EPC page flags to identify owner types Haitao Huang
2023-09-23  3:06 ` [PATCH v5 11/18] x86/sgx: store unreclaimable pages in LRU lists Haitao Huang
2023-09-27 11:57   ` Huang, Kai
2023-10-03  5:42     ` Haitao Huang
2023-09-28  9:41   ` Huang, Kai
2023-10-03  5:15     ` Haitao Huang
2023-10-03 20:12       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Haitao Huang
2023-10-09 23:45   ` Huang, Kai
2023-10-10  0:23     ` Sean Christopherson
2023-10-10  0:50       ` Huang, Kai
2023-10-10  1:34         ` Huang, Kai
2023-10-10 16:49           ` Haitao Huang
2023-10-11  0:51             ` Huang, Kai
2023-10-12 13:27               ` Haitao Huang
2023-10-16 10:57                 ` Huang, Kai
2023-10-16 19:52                   ` Haitao Huang
2023-10-16 21:09                     ` Huang, Kai
2023-10-17  0:10                       ` Haitao Huang
2023-10-17  1:34                         ` Huang, Kai
2023-10-17 12:58                           ` Haitao Huang
2023-10-17 18:54                             ` Michal Koutný
2023-10-17 19:13                               ` Michal Koutný
2023-10-18  4:39                                 ` Haitao Huang
2023-10-18  4:37                               ` Haitao Huang
2023-10-18 13:55                                 ` Dave Hansen
2023-10-18 15:26                                   ` Haitao Huang
2023-10-18 15:37                                     ` Dave Hansen
2023-10-18 15:52                                       ` Michal Koutný
2023-10-18 16:25                                         ` Haitao Huang
2023-10-16 21:32                     ` Sean Christopherson
2023-10-17  0:09                       ` Haitao Huang
2023-10-17 15:43                         ` Sean Christopherson
2023-10-17 11:49                       ` Mikko Ylinen
2023-10-11  1:14             ` Huang, Kai
2023-10-16 11:02               ` Huang, Kai
2023-10-10  1:42       ` Haitao Huang
2023-10-10  2:23         ` Huang, Kai
2023-10-10 13:26           ` Haitao Huang
2023-10-11  0:01             ` Sean Christopherson
2023-10-11 15:02               ` Haitao Huang
2023-10-10  1:04     ` Haitao Huang [this message]
2023-10-10  1:18       ` Huang, Kai
2023-10-10  1:38         ` Haitao Huang
2023-10-10  2:12           ` Huang, Kai
2023-10-10 17:05             ` Haitao Huang
2023-10-11  0:31               ` Huang, Kai
2023-10-11 16:04                 ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Haitao Huang
2023-10-05 12:24   ` Huang, Kai
2023-10-05 19:23     ` Haitao Huang
2023-10-05 20:25       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 14/18] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Haitao Huang
2023-09-23  3:06 ` [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs Haitao Huang
2023-10-05 12:30   ` Huang, Kai
2023-10-05 19:33     ` Haitao Huang
2023-10-05 20:38       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller Haitao Huang
2023-09-25 17:15   ` Jarkko Sakkinen
2023-10-05 21:01   ` Huang, Kai
2023-10-10  0:12   ` Huang, Kai
2023-10-10  0:16   ` Huang, Kai
2023-10-10  0:26   ` Huang, Kai
2023-10-22 18:26     ` Haitao Huang
2023-10-10  9:19   ` Huang, Kai
2023-10-10  9:32   ` Huang, Kai
2023-10-17 18:54   ` Michal Koutný
2023-10-19 16:05     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 17/18] Docs/x86/sgx: Add description for cgroup support Haitao Huang
2023-09-23  3:06 ` [PATCH v5 18/18] selftests/sgx: Add scripts for EPC cgroup testing Haitao Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=op.2ckqmgs9wjvjmi@hhuan26-mobl.amr.corp.intel.com \
    --to=haitao.huang@linux.intel.com \
    --cc=anakrish@microsoft.com \
    --cc=bp@alien8.de \
    --cc=cgroups@vger.kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=kai.huang@intel.com \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mikko.ylinen@linux.intel.com \
    --cc=mingo@redhat.com \
    --cc=seanjc@google.com \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    --cc=yangjie@microsoft.com \
    --cc=zhanb@microsoft.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).