All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Haitao Huang <haitao.huang@linux.intel.com>
Cc: Kai Huang <kai.huang@intel.com>, Bo Zhang <zhanb@microsoft.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"yangjie@microsoft.com" <yangjie@microsoft.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	Zhiquan1 Li <zhiquan1.li@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"tj@kernel.org" <tj@kernel.org>,
	"anakrish@microsoft.com" <anakrish@microsoft.com>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"mikko.ylinen@linux.intel.com" <mikko.ylinen@linux.intel.com>,
	Sohil Mehta <sohil.mehta@intel.com>,
	"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>
Subject: Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
Date: Mon, 16 Oct 2023 14:32:31 -0700	[thread overview]
Message-ID: <ZS2r7-EAEovpV4BN@google.com> (raw)
In-Reply-To: <op.2cxatlafwjvjmi@hhuan26-mobl.amr.corp.intel.com>

On Mon, Oct 16, 2023, Haitao Huang wrote:
> From this perspective, I think the current implementation is "well-defined":
> EPC cgroup limits for VMs are only enforced at VM launch time, not runtime.
> In practice,  SGX VM can be launched only with fixed EPC size and all those
> EPCs are fully committed to the VM once launched.

Fully committed doesn't mean those numbers are reflected in the cgroup.  A VM
scheduler can easily "commit" EPC to a guest, but allocate EPC on demand, i.e.
when the guest attempts to actually access a page.  Preallocating memory isn't
free, e.g. it can slow down guest boot, so it's entirely reasonable to have virtual
EPC be allocated on-demand.  Enforcing at launch time doesn't work for such setups,
because from the cgroup's perspective, the VM is using 0 pages of EPC at launch.

> Because of that, I imagine people are using VMs to primarily partition the
> physical EPCs, i.e, the static size itself is the 'limit' for the workload of
> a single VM and not expecting EPCs taken away at runtime.

If everything goes exactly as planned, sure.  But it's not hard to imagine some
configuration change way up the stack resulting in the hard limit for an EPC cgroup
being lowered.

> So killing does not really add much value for the existing usages IIUC.

As I said earlier, the behavior doesn't have to result in terminating a VM, e.g.
the virtual EPC code could provide a knob to send a signal/notification if the
owning cgroup has gone above the limit and the VM is targeted for forced reclaim.

> That said, I don't anticipate adding the enforcement of killing VMs at
> runtime would break such usages as admin/user can simply choose to set the
> limit equal to the static size to launch the VM and forget about it.
> 
> Given that, I'll propose an add-on patch to this series as RFC and have some
> feedback from community before we decide if that needs be included in first
> version or we can skip it until we have EPC reclaiming for VMs.

Gracefully *swapping* virtual EPC isn't required for oversubscribing virtual EPC.
Think of it like airlines overselling tickets.  The airline sells more tickets
than they have seats, and banks on some passengers canceling.  If too many people
show up, the airline doesn't swap passengers to the cargo bay, they just shunt them
to a different plane.

The same could be easily be done for hosts and virtual EPC.  E.g. if every VM
*might* use 1GiB, but in practice 99% of VMs only consume 128MiB, then it's not
too crazy to advertise 1GiB to each VM, but only actually carve out 256MiB per VM
in order to pack more VMs on a host.  If the host needs to free up EPC, then the
most problematic VMs can be migrated to a different host.

Genuinely curious, who is asking for EPC cgroup support that *isn't* running VMs?
AFAIK, these days, SGX is primarily targeted at cloud.  I assume virtual EPC is
the primary use case for an EPC cgroup.

I don't have any skin in the game beyond my name being attached to some of the
patches, i.e. I certainly won't stand in the way.  I just don't understand why
you would go through all the effort of adding an EPC cgroup and then not go the
extra few steps to enforce limits for virtual EPC.  Compared to the complexity
of the rest of the series, that little bit seems quite trivial.

  parent reply	other threads:[~2023-10-16 21:32 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-23  3:06 [PATCH v5 00/18] Add Cgroup support for SGX EPC memory Haitao Huang
2023-09-23  3:06 ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 01/18] cgroup/misc: Add per resource callbacks for CSS events Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:09   ` Jarkko Sakkinen
2023-09-25 17:09     ` Jarkko Sakkinen
2023-09-26  3:04     ` Haitao Huang
2023-09-26 13:10       ` Jarkko Sakkinen
2023-09-26 13:10         ` Jarkko Sakkinen
2023-09-26 13:13         ` Jarkko Sakkinen
2023-09-26 13:13           ` Jarkko Sakkinen
2023-09-27  1:56           ` Haitao Huang
2023-10-02 22:47             ` Jarkko Sakkinen
2023-10-02 22:55               ` Jarkko Sakkinen
2023-10-04 15:45                 ` Haitao Huang
2023-10-04 17:18                   ` Tejun Heo
2023-09-27  9:20   ` Huang, Kai
2023-10-03 14:29     ` Haitao Huang
2023-10-17 18:55   ` Michal Koutný
2023-09-23  3:06 ` [PATCH v5 02/18] cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 18:50   ` Tejun Heo
2023-09-25 18:50     ` Tejun Heo
2023-09-28  3:59   ` Huang, Kai
2023-10-03  7:00     ` Haitao Huang
2023-10-03 19:33       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 03/18] x86/sgx: Add sgx_epc_lru_lists to encapsulate LRU lists Haitao Huang
2023-09-23  3:06 ` [PATCH v5 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 05/18] x86/sgx: Store reclaimable EPC pages in sgx_epc_lru_lists Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 10:14   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 06/18] x86/sgx: Introduce EPC page states Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:11   ` Jarkko Sakkinen
2023-09-25 17:11     ` Jarkko Sakkinen
2023-09-27 10:28   ` Huang, Kai
2023-10-03  4:49     ` Haitao Huang
2023-10-03 20:03       ` Huang, Kai
2023-10-04 15:24         ` Haitao Huang
2023-10-04 21:05           ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 07/18] x86/sgx: Introduce RECLAIM_IN_PROGRESS state Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:13   ` Jarkko Sakkinen
2023-09-25 17:13     ` Jarkko Sakkinen
2023-09-27 10:42   ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 08/18] x86/sgx: Use a list to track to-be-reclaimed pages Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-28  9:28   ` Huang, Kai
2023-10-03  5:09     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 11:14   ` Huang, Kai
2023-09-27 15:35     ` Haitao Huang
2023-09-27 21:21       ` Huang, Kai
2023-09-29 15:06         ` Haitao Huang
2023-10-02 11:05           ` Huang, Kai
2023-09-27 11:35   ` Huang, Kai
2023-10-03  6:45     ` Haitao Huang
2023-10-03 20:07       ` Huang, Kai
2023-10-04 15:03         ` Haitao Huang
2023-10-04 21:13           ` Huang, Kai
2023-10-05  4:22             ` Haitao Huang
2023-10-05  6:49               ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 10/18] x86/sgx: Add EPC page flags to identify owner types Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 11/18] x86/sgx: store unreclaimable pages in LRU lists Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-27 11:57   ` Huang, Kai
2023-10-03  5:42     ` Haitao Huang
2023-09-28  9:41   ` Huang, Kai
2023-10-03  5:15     ` Haitao Huang
2023-10-03 20:12       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-09 23:45   ` Huang, Kai
2023-10-10  0:23     ` Sean Christopherson
2023-10-10  0:50       ` Huang, Kai
2023-10-10  1:34         ` Huang, Kai
2023-10-10 16:49           ` Haitao Huang
2023-10-11  0:51             ` Huang, Kai
2023-10-12 13:27               ` Haitao Huang
2023-10-16 10:57                 ` Huang, Kai
2023-10-16 19:52                   ` Haitao Huang
2023-10-16 21:09                     ` Huang, Kai
2023-10-17  0:10                       ` Haitao Huang
2023-10-17  1:34                         ` Huang, Kai
2023-10-17 12:58                           ` Haitao Huang
2023-10-17 18:54                             ` Michal Koutný
2023-10-17 19:13                               ` Michal Koutný
2023-10-18  4:39                                 ` Haitao Huang
2023-10-18  4:37                               ` Haitao Huang
2023-10-18 13:55                                 ` Dave Hansen
2023-10-18 15:26                                   ` Haitao Huang
2023-10-18 15:37                                     ` Dave Hansen
2023-10-18 15:52                                       ` Michal Koutný
2023-10-18 16:25                                         ` Haitao Huang
2023-10-16 21:32                     ` Sean Christopherson [this message]
2023-10-17  0:09                       ` Haitao Huang
2023-10-17 15:43                         ` Sean Christopherson
2023-10-17 11:49                       ` Mikko Ylinen
2023-10-11  1:14             ` Huang, Kai
2023-10-16 11:02               ` Huang, Kai
2023-10-10  1:42       ` Haitao Huang
2023-10-10  2:23         ` Huang, Kai
2023-10-10 13:26           ` Haitao Huang
2023-10-11  0:01             ` Sean Christopherson
2023-10-11 15:02               ` Haitao Huang
2023-10-10  1:04     ` Haitao Huang
2023-10-10  1:18       ` Huang, Kai
2023-10-10  1:38         ` Haitao Huang
2023-10-10  2:12           ` Huang, Kai
2023-10-10 17:05             ` Haitao Huang
2023-10-11  0:31               ` Huang, Kai
2023-10-11 16:04                 ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-05 12:24   ` Huang, Kai
2023-10-05 19:23     ` Haitao Huang
2023-10-05 20:25       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 14/18] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-10-05 12:30   ` Huang, Kai
2023-10-05 19:33     ` Haitao Huang
2023-10-05 20:38       ` Huang, Kai
2023-09-23  3:06 ` [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-25 17:15   ` Jarkko Sakkinen
2023-09-25 17:15     ` Jarkko Sakkinen
2023-10-05 21:01   ` Huang, Kai
2023-10-10  0:12   ` Huang, Kai
2023-10-10  0:16   ` Huang, Kai
2023-10-10  0:26   ` Huang, Kai
2023-10-22 18:26     ` Haitao Huang
2023-10-10  9:19   ` Huang, Kai
2023-10-10  9:32   ` Huang, Kai
2023-10-17 18:54   ` Michal Koutný
2023-10-19 16:05     ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 17/18] Docs/x86/sgx: Add description for cgroup support Haitao Huang
2023-09-23  3:06   ` Haitao Huang
2023-09-23  3:06 ` [PATCH v5 18/18] selftests/sgx: Add scripts for EPC cgroup testing Haitao Huang
2023-09-23  3:06   ` Haitao Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZS2r7-EAEovpV4BN@google.com \
    --to=seanjc@google.com \
    --cc=anakrish@microsoft.com \
    --cc=bp@alien8.de \
    --cc=cgroups@vger.kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=haitao.huang@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=kai.huang@intel.com \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mikko.ylinen@linux.intel.com \
    --cc=mingo@redhat.com \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    --cc=yangjie@microsoft.com \
    --cc=zhanb@microsoft.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.