From: Jarkko Sakkinen <jarkko@kernel.org>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: dave.hansen@linux.intel.com, tglx@linutronix.de, bp@alien8.de,
mingo@redhat.com, hpa@zytor.com, md.iqbal.hossain@intel.com,
haitao.huang@intel.com, linux-sgx@vger.kernel.org,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2] x86/sgx: Reduce delay and interference of enclave release
Date: Tue, 1 Nov 2022 03:27:20 +0200
Message-ID: <Y2B1+NM2ONHGPgwj@kernel.org>
In-Reply-To: <00efa80dd9e35dc85753e1c5edb0344ac07bb1f0.1667236485.git.reinette.chatre@intel.com>
On Mon, Oct 31, 2022 at 10:29:58AM -0700, Reinette Chatre wrote:
> Commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
> releasing large enclaves") introduced a cond_resched() during enclave
> release where the EREMOVE instruction is applied to every 4k enclave
> page. Giving other tasks an opportunity to run while tearing down a
> large enclave placates the soft lockup detector, but Iqbal found
> that the fix causes a 25% performance degradation of a workload
> run using Gramine.
>
> Gramine maintains a 1:1 mapping between processes and SGX enclaves.
> That means if a workload in an enclave creates a subprocess then
> Gramine creates a duplicate enclave for that subprocess to run in.
> The consequence is that the release of the enclave used to run
> the subprocess can impact the performance of the workload that is
> run in the original enclave, especially in large enclaves when
> SGX2 is not in use.
>
> The workload run by Iqbal behaves as follows:
> Create enclave (enclave "A")
> /* Initialize workload in enclave "A" */
> Create enclave (enclave "B")
> /* Run subprocess in enclave "B" and send result to enclave "A" */
> Release enclave (enclave "B")
> /* Run workload in enclave "A" */
> Release enclave (enclave "A")
>
> The performance impact of releasing enclave "B" in the above scenario
> is amplified when there is a lot of SGX memory and the enclave size
> matches the SGX memory. When there is 128GB SGX memory and an enclave
> size of 128GB, from the time enclave "B" starts the 128GB SGX memory
> is oversubscribed with a combined demand for 256GB from the two
> enclaves.
>
> Before commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
> releasing large enclaves") enclave release was done in a tight loop
> without giving other tasks a chance to run. Even though the system
> experienced soft lockups the workload (run in enclave "A") obtained
> good performance numbers because when the workload started running
> there was no interference.
>
> Commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
> releasing large enclaves") gave other tasks an opportunity to run while
> an enclave is released. In this scenario that means that while enclave
> "B" is being released, and needs to access each of its pages in order
> to run the SGX EREMOVE instruction on it, enclave "A" is attempting to
> run the workload and needs to access its own enclave pages. This causes
> a lot of swapping due to the demand for the oversubscribed SGX memory.
> The workload in enclave "A" experiences longer latencies while enclave
> "B" is released.
>
> Improve the performance of enclave release while still avoiding the
> soft lockup detector with two enhancements:
> - Only call cond_resched() after XA_CHECK_SCHED iterations.
> - Use the xarray advanced API to keep the xarray locked for
> XA_CHECK_SCHED iterations instead of locking and unlocking
> at every iteration.
>
> This batching solution is copied from sgx_encl_may_map() that
> also iterates through all enclave pages using this technique.
>
> With this enhancement the workload experiences a 5%
> performance degradation when compared to a kernel without
> commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
> releasing large enclaves"), an improvement over the reported 25%
> degradation, while still placating the soft lockup detector.
>
> Scenarios with poor performance are still possible even with these
> enhancements, for example short workloads that create subprocesses
> while running in large enclaves. Further performance improvements
> are pursued in user space by avoiding the creation of duplicate
> enclaves for certain subprocesses, and by using SGX2, which allocates
> pages lazily as needed so that enclaves created for subprocesses start
> quickly and release quickly.
>
> Fixes: 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves")
> Reported-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com>
> Tested-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> I do not know if this qualifies as stable material.
>
> Changes since V1:
> - V1: https://lore.kernel.org/lkml/06a5f478d3bfaa57954954c82dd5d4040450171d.1666130846.git.reinette.chatre@intel.com/
> - Use a local variable for the max index instead of open coding it in the loop. (Jarkko)
> - Send to broader X86 audience.
>
> arch/x86/kernel/cpu/sgx/encl.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 1ec20807de1e..2c258255a629 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -680,11 +680,15 @@ const struct vm_operations_struct sgx_vm_ops = {
>  void sgx_encl_release(struct kref *ref)
>  {
>  	struct sgx_encl *encl = container_of(ref, struct sgx_encl, refcount);
> +	unsigned long max_page_index = PFN_DOWN(encl->base + encl->size - 1);
>  	struct sgx_va_page *va_page;
>  	struct sgx_encl_page *entry;
> -	unsigned long index;
> +	unsigned long count = 0;
> +
> +	XA_STATE(xas, &encl->page_array, PFN_DOWN(encl->base));
>
> -	xa_for_each(&encl->page_array, index, entry) {
> +	xas_lock(&xas);
> +	xas_for_each(&xas, entry, max_page_index) {
>  		if (entry->epc_page) {
>  			/*
>  			 * The page and its radix tree entry cannot be freed
> @@ -699,9 +703,20 @@ void sgx_encl_release(struct kref *ref)
>  		}
>
>  		kfree(entry);
> -		/* Invoke scheduler to prevent soft lockups. */
> -		cond_resched();
> +		/*
> +		 * Invoke scheduler on every XA_CHECK_SCHED iteration
> +		 * to prevent soft lockups.
> +		 */
> +		if (!(++count % XA_CHECK_SCHED)) {
> +			xas_pause(&xas);
> +			xas_unlock(&xas);
> +
> +			cond_resched();
> +
> +			xas_lock(&xas);
> +		}
>  	}
> +	xas_unlock(&xas);
>
>  	xa_destroy(&encl->page_array);
>
>
> --
> 2.34.1
>
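To get a feel for how much the batching in the patch above reduces the
number of scheduling points, here is a minimal user-space sketch. It is
not the kernel code and only mimics the shape of the loop. Everything in
it is an assumption made for illustration: CHECK_SCHED mirrors the value
of XA_CHECK_SCHED (4096 in include/linux/xarray.h), a pthread mutex
stands in for the xarray lock, sched_yield() stands in for
cond_resched(), and release_pages() is a hypothetical helper.

```c
#include <pthread.h>
#include <sched.h>

/* Assumption: mirrors XA_CHECK_SCHED (4096) from include/linux/xarray.h. */
#define CHECK_SCHED 4096ULL

/* Hypothetical stand-in for the xarray lock held by xas_lock(). */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Walk nr_pages entries with the lock held, dropping it and yielding
 * once every CHECK_SCHED entries. Returns the number of yield points,
 * which is what cond_resched() calls would be in the patched loop.
 */
unsigned long long release_pages(unsigned long long nr_pages)
{
	unsigned long long count = 0, yields = 0, i;

	pthread_mutex_lock(&page_lock);
	for (i = 0; i < nr_pages; i++) {
		/* Per-page teardown would happen here. */

		if (!(++count % CHECK_SCHED)) {
			/* Drop the lock before giving up the CPU. */
			pthread_mutex_unlock(&page_lock);
			sched_yield();	/* stand-in for cond_resched() */
			yields++;
			pthread_mutex_lock(&page_lock);
		}
	}
	pthread_mutex_unlock(&page_lock);

	return yields;
}
```

For the 128GB enclave from the commit message, (128ULL << 30) / 4096
gives 33554432 pages. With one scheduling point per page that is
33554432 cond_resched() calls. With the batching, release_pages() on the
same page count yields 8192 times, and the lock is taken once per batch
instead of once per page.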
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko