From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Dmitrii Kuvaiskii" <dmitrii.kuvaiskii@intel.com>,
<dave.hansen@linux.intel.com>, <kai.huang@intel.com>,
<haitao.huang@linux.intel.com>, <reinette.chatre@intel.com>,
<linux-sgx@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <mona.vij@intel.com>, <kailun.qin@intel.com>
Subject: Re: [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows
Date: Mon, 29 Apr 2024 16:06:39 +0300
Message-ID: <D0WMNTCRUN00.TQHC8O6X6WI2@kernel.org>
In-Reply-To: <20240429104330.3636113-1-dmitrii.kuvaiskii@intel.com>

On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote:
> SGX runtimes such as Gramine may implement EDMM-based lazy allocation of
> enclave pages and may support MADV_DONTNEED semantics [1]. The former
> implies #PF-based page allocation, and the latter implies the usage of
> SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl.
>
> A trivial program like below (run under Gramine and with EDMM enabled)
> stresses these two flows in the SGX driver and hangs:
>
>   /* repeatedly touch different enclave pages at random and mix with
>    * `madvise(MADV_DONTNEED)` to stress EAUG/EREMOVE flows */
>   static void* thread_func(void* arg) {
>       size_t num_pages = 0xA000 / page_size;
>       for (int i = 0; i < 5000; i++) {
>           size_t page = get_random_ulong() % num_pages;
>           char data = READ_ONCE(((char*)arg)[page * page_size]);
>
>           page = get_random_ulong() % num_pages;
>           madvise(arg + page * page_size, page_size, MADV_DONTNEED);
>       }
>   }
>
>   addr = mmap(NULL, 0xA000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS, -1, 0);
>   pthread_t threads[16];
>   for (int i = 0; i < 16; i++)
>       pthread_create(&threads[i], NULL, thread_func, addr);

I'm not convinced that the kernel is the problem here; it could also be
how Gramine is implemented.

So maybe you could make a better case for that. The example looks a bit
artificial to me.
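
For anyone who wants to run the quoted loop as a standalone program, here
is a minimal sketch. It assumes plain Linux with pthreads, so it only
exercises ordinary anonymous memory and does not reach the EAUG/EREMOVE
paths unless it is run inside an EDMM-enabled enclave. The names
`struct stress_args`, `run_stress`, `NUM_PAGES` and `MAX_THREADS` are
hypothetical, `rand_r()` stands in for `get_random_ulong()`, a volatile
read stands in for `READ_ONCE()`, and `MAP_PRIVATE` is added because
mmap() requires a sharing flag.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define NUM_PAGES   10   /* 0xA000 bytes with 4 KiB pages, as in the quote */
#define MAX_THREADS 64   /* hypothetical upper bound for this sketch */

struct stress_args {     /* hypothetical per-thread state */
    char *base;
    size_t page_size;
    int iterations;
    unsigned int seed;
};

static void *thread_func(void *arg)
{
    struct stress_args *a = arg;

    for (int i = 0; i < a->iterations; i++) {
        /* Touch a random page; the volatile read stands in for READ_ONCE(). */
        size_t page = (size_t)rand_r(&a->seed) % NUM_PAGES;
        (void)*(volatile char *)(a->base + page * a->page_size);

        /* Drop another random page, possibly one being touched elsewhere. */
        page = (size_t)rand_r(&a->seed) % NUM_PAGES;
        madvise(a->base + page * a->page_size, a->page_size, MADV_DONTNEED);
    }
    return NULL;
}

/* Returns 0 if all threads ran to completion, -1 on any setup failure. */
static int run_stress(int num_threads, int iterations)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || num_threads <= 0 || num_threads > MAX_THREADS)
        return -1;

    size_t page_size = (size_t)ps;
    size_t len = NUM_PAGES * page_size;
    char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    pthread_t threads[MAX_THREADS];
    struct stress_args args[MAX_THREADS];
    int started = 0, rc = 0;

    for (; started < num_threads; started++) {
        args[started] = (struct stress_args){
            addr, page_size, iterations, (unsigned int)started + 1
        };
        if (pthread_create(&threads[started], NULL, thread_func,
                           &args[started]) != 0) {
            rc = -1;
            break;
        }
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    munmap(addr, len);
    return rc;
}
```

Calling `run_stress(16, 5000)` from `main()` mirrors the thread and
iteration counts in the quoted snippet; build with `-pthread`.
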
>
> This program uncovers two data races in the SGX driver. The remaining
> patches describe and fix these races.
>
> I performed several stress tests to verify that there are no other data
> races (at least with the test program above):
>
> - On Icelake server with 128GB of PRMRR (EPC), without madvise(). This
>   stresses the first data race. A Gramine SGX test suite running in the
>   background for additional stressing. Result: 1,000 runs without hangs
>   (result without the first bug fix: hangs every time).
> - On Icelake server with 128GB of PRMRR (EPC), with madvise(). This
>   stresses the second data race. A Gramine SGX test suite running in the
>   background for additional stressing. Result: 1,000 runs without hangs
>   (result with the first bug fix but without the second bug fix: hangs
>   approx. once in 50 runs).
> - On Icelake server with 4GB of PRMRR (EPC), with madvise(). This
>   additionally stresses the enclave page swapping flows. Two Gramine SGX
>   test suites running in the background for additional stressing of
>   swapping (I observe 100% CPU utilization from ksgxd which confirms that
>   swapping happens). Result: 1,000 runs without hangs.
>
> (Sorry for the previous copy of this email, accidentally sent to
> stable@vger.kernel.org. Failed to use `--suppress-cc` during a test send.)
>
> Dmitrii Kuvaiskii (2):
>   x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
>   x86/sgx: Resolve EREMOVE page vs EAUG page data race
>
>  arch/x86/kernel/cpu/sgx/encl.c  | 10 +++++++---
>  arch/x86/kernel/cpu/sgx/encl.h  |  3 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c |  1 +
>  3 files changed, 11 insertions(+), 3 deletions(-)

BR, Jarkko

Thread overview: 13+ messages
2024-04-29 10:43 [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-04-29 10:43 ` [PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-04-29 13:04 ` Jarkko Sakkinen
2024-04-29 13:22 ` Jarkko Sakkinen
2024-04-29 13:24 ` Jarkko Sakkinen
2024-04-30 14:37 ` Dmitrii Kuvaiskii
2024-05-10 23:47 ` Reinette Chatre
2024-04-29 10:43 ` [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2024-04-29 13:11 ` Jarkko Sakkinen
2024-04-30 14:38 ` Dmitrii Kuvaiskii
2024-05-10 23:47 ` Reinette Chatre
2024-04-29 13:06 ` Jarkko Sakkinen [this message]
2024-04-30 14:35 ` [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii